Thursday, 1st of October

8h30 - 9h30 Registration to the Zoom Platform
9h20 - 9h40 Welcome to SIMBig 2020
9h40 - 10h00 Diagnosis of SARS-CoV-2 based on patient symptoms and diffuse classifiers Fray Luis Becerra Suarez, Herber Ivan Mejia Cabrera and Victor Alexci Tuesta Monteza
10h00 - 10h40
Keynote Speaker: Ian Horrocks

Title: Knowledge Graph Creation and Curation


Knowledge Graphs are increasingly important resources in science, medicine and industry, and systems for storing and querying Knowledge Graphs are becoming increasingly capable. However, creating and curating high-quality knowledge is still a hard problem, and this could impede their adoption. In this talk we will consider the nature of the problem and survey some recent (and not so recent) work that attempts to address it.

Session 1: Semantic Web and Social Networks
10h40 - 11h00 Distributed Identity Management for Semantic Entities Falko Schönteich, Andreas Kasten and Ansgar Scherp
11h00 - 11h20 Telegram: Data Collection, Opportunities and Challenges Tuja Khaund, Muhammad Nihal Hussain, Mainuddin Shaik and Nitin Agarwal
11h20 - 11h40 Graph theory applied to International Code of Diseases (ICD) in a hospital Boldorini Jr. Claudio, Carlos Euzebio, Lucas Porto, Alexandre Martinez and Evandro Ruiz
11h40 - 12h00 CovidStream: Interactive Visualization of Emotions Evolution Associated with COVID-19 Herwin Alayn Huillcen Baca, Flor de Luz Palomino Valdivia, Yalmar Ponce Atencio, Manuel Ibarra Cabrera, Mario Aquino Cruz and Melvin Edward Huillcen Baca
14h00 - 14h40
Keynote Speaker: Dina Demner

Title: Retrieving information, answering questions, and detecting misinformation about COVID-19


One of the effects of the COVID-19 pandemic is a rapidly growing and changing stream of publications to inform clinicians, researchers, policy makers, and patients about the health, socio-economic, and cultural consequences of the pandemic. Leveraging this stream of information is essential for developing policies, guidelines and strategies during the pandemic, for recovery after the COVID-19 pandemic, and for designing measures to prevent recurrence of similar threats. Managing this information stream manually is not feasible. Automated approaches are needed to quickly bring the most salient and reliable points to the readers’ attention. Leveraging the CORD-19 collection of scientific articles, data from government sites, and questions about SARS-CoV-2 asked by researchers, clinicians and the public, in collaboration with the National Institute of Standards and Technology (NIST), Ai2 and UTHealth and OHSU researchers, we have developed datasets for retrieval of COVID-19 information and automatic question answering. These datasets allowed us to (1) conduct a community-wide evaluation of the systems’ ability to satisfy the needs for high-quality timely information about COVID-19; (2) develop deep-learning approaches to meeting information needs as they evolve during pandemics; and (3) develop approaches to detecting misinformation. This talk will present the approaches to dataset development, our systems for question answering and misinformation detection, and some preliminary results.

Session 2: Natural Language Processing and Text mining
14h40 - 15h00 Comparative Analysis of Question Answering Models for HRI Tasks with NAO in Spanish Enrique Burga-Gutierrez, Bryam Vasquez-Chauca and Willy Ugarte
15h00 - 15h20 Peruvian citizens reaction to Reactiva Perú Program: A Twitter Sentiment Analysis Approach Rosmery Ramos-Sandoval
15h20 - 15h40 Twitter Early Prediction of preferences and tendencies based in neighborhood behavior Emanuel Meriles, Martin Ariel Dominguez and Pablo Gabriel Celayes
15h40 - 16h00 Summarization of Twitter Events with Deep Neural Network Pre-trained Models Kunal Chakma, Amitava Das and Swapan Debbarma
16h00 - 16h20 Multi-strategic author names disambiguation in bibliography repositories Natan de Souza Rodrigues, Aurelio Ribeiro Costa, Lucas Correa Lemos and Celia Ghedini Ralha
16h20 - 16h40 Machine Learning techniques for speech emotion classification Noe Melo Locummber and Junior Fabian
16h40 - 17h00 An evaluation of physiological public datasets for emotion recognition systems Alexis Mendoza, Alvaro Cuno, Wilber Ramos and Nelly Condori-Fernandez
17h00 - 17h40
Keynote Speaker: Sophia Ananiadou

Title: Text mining methods for evidence-based medicine


Text mining methods have been used for the effective and timely identification of knowledge for evidence-based medicine. Evidence-based medicine seeks to answer well-posed questions using existing results by applying a systematic and transparent methodology. The evidence, e.g., results of clinical trials, should be collected and evaluated without bias, using consistent criteria for inclusion. Literature databases such as MEDLINE provide instant access to millions of research articles, obtained from an ever-expanding list of indexed journals. This progress comes at a cost. The majority of clinicians do not possess the expert literature search skills required to develop complex search strategies, and existing databases are manually updated. I will discuss methods for automating the systematic review process, covering search, screening, and data extraction. These text mining methods have been integrated into the RobotAnalyst system, currently used by NICE.

Friday, 2nd of October

8h30 - 9h00 Registration to the Zoom Platform
9h20 - 10h00
Keynote Speaker: Maguelonne Teisseire

Title: Text mining activities in the context of the MOOD project


The MOOD project aims at applying data mining and analytical techniques to the big data originating from multiple sources to improve detection, monitoring, and assessment of emerging diseases in Europe. In this presentation, I will present part of the text mining activities conducted in the TETIS lab. In particular, we will see how text analysis could help to detect weak signals for epidemic outbreak detection.

Session 3: Machine Learning
10h00 - 10h20 YTTREX: crowdsourced analysis of Youtube's recommender system during COVID-19 pandemic Leonardo Sanna, Salvatore Romano, Giulia Corona and Claudio Agosti
10h20 - 10h40 Parallel Social Spider Optimization Algorithms with Island Model for the Clustering Problem Edwin Alvarez-Mamani, Lauro Enciso-Rodas, Mauricio Ayala-Rincón and José Luis Soncco-Álvarez
Session 4: Machine Learning
10h40 - 11h00 Two-class fuzzy clustering ensemble approach based on a constraint on fuzzy memberships Omid Aligholipour and Mehmet Kuntalp
11h00 - 11h20 Modeling and Predicting the Lima Stock Exchange General Index with Bayesian Networks and Information from Foreign Markets Daniel Chapi, Soledad Espezua Llerena, Julio Villavicencio, Oscar Enrique Miranda Castillo and Edwin Rafael Villanueva Talavera
11h20 - 11h40 Comparative study of spatial prediction models for estimating PM2.5 concentration level in urban areas Irvin Vargas Campos and Edwin Villanueva
11h40 - 12h00 Prediction of Solar Radiation using Forecasting Neural Networks Marcos Ponce-Jara, Alvaro Talavera, Carlos Velásquez and David Tonato
12h00 - 12h20 COVID-19 Infection Prediction and Classification Souad Taleb Zouggar and Abdelkader Adla
14h00 - 14h40
Keynote Speaker: Andrew Tomkins

Title: Incorporating Graph Data into Machine Learning


We perform data mining tasks today over huge and growing datasets. To handle scale, we rely on a handful of highly optimized primitive operations. The current workhorse primitive in training of ML models is stochastic gradient descent, which incorporates per-instance label data efficiently into a model's internal state. However, information is often available to us not just as labels but also through connections between data instances, often presented as graphs and sometimes in other forms. In this talk, Andrew Tomkins will describe a number of different approaches to incorporating such higher-order data into scalable training and inference, and will suggest some open problems in this area.

Session 5: Image processing
14h40 - 15h00 Towards a benchmark for sedimentary facies classification: Applied to the Netherlands F3 Block Maykol Jiampiers Campos Trinidad, Smith Washington Arauco Canchumuni and Marco Aurelio Cavalcanti Pacheco
15h00 - 15h20 Mobile application for movement recognition in the rehabilitation of the anterior cruciate ligament of the knee Iam Contreras-Alcázar, Kreyh Contreras-Alcázar and Victor Cornejo-Aparicio
15h20 - 15h40 Semantic Segmentation using Convolutional Neural Networks for Volume Estimation of Native Potatoes at High Speed Miguel Angel Chicchon Apaza and Ronny Michael Huerta Firma
15h40 - 16h00 Symbiotic Trackers’ Ensemble with Trackers’ Re-initialization for Face Tracking Victor Ayma, Patrick Happ, Raul Feitosa, Gilson Costa and Bruno Feijó
16h00 - 16h20 Multi-class Vehicle Detection and Automatic License Plate Recognition based on YOLO in Latin American Context Pedro Montenegro, Jhonatan Camasca and Junior Fabian
16h20 - 16h40 Static Summarization using Pearson’s Coefficient and Transfer Learning for Anomaly detection for Surveillance Videos Steve Chancolla-Neira, Cesar Salinas-Lozano and Willy Ugarte
16h40 - 17h00 Humpback Whale’s Flukes Segmentation Algorithms Andrea Jackeline Castro Cabanillas and Victor Hugo Ayma Quirita
17h00 - 17h20 Improving Context-Aware Music Recommender Systems with a Dual Recurrent Neural Network Igor Santana and Marcos Domingues

Saturday, 3rd of October

9h00 - 9h20 Registration to the Zoom Platform
9h20 - 10h00
Keynote Speaker: Francisco Rodrigues

Title: Predicting dynamical processes in complex networks: a machine learning approach


One of the most fundamental problems in Network Science is to understand how dynamical processes are influenced by the network organization. For instance, if we can understand how patterns of connections between coupled oscillators influence the evolution of the synchronous state, then we can change the network topology to control the level of synchronization of power grids and electronic circuits. In this talk, we will discuss how machine learning methods can be used to predict disease propagation in networks and the state of coupled oscillators. The current challenges in Network Science and some possible ideas for future research will also be discussed in our talk.

Session 6: Social Networks
10h00 - 10h20 Classification of Cybercrime Indicators in Open Social Data Ihsan Ullah, Caoilfhionn Lane, Teodora Buda, Brett Drury, Marc Mellotte, Haytham Assem and Michael Madden
10h20 - 10h40 StrCoBSP: relationship Strength-aware Community-based Social Profiling Asma Chader, Hamid Haddadou, Leila Hamdad and Walid-Khaled Hidouci
10h40 - 11h00 Identifying Differentiating Factors for Cyberbullying in Vine and Instagram Rahat Ibn Rafiq, Homa Hosseinmardi, Richard Han, Qin Lv and Shivakant Mishra
11h00 - 11h20 Effect of Social Algorithms on Media Source Publishers in Social Media Ecosystems Ittipon Rassameeroj and S. Felix Wu
11h20 - 11h40 The Identification of Framing Language in Business Leaders' Speech From the Mass Media Brett Drury and Samuel Morais Drury
11h40 - 12h20 Clustering Analysis of Website Usage on Twitter during the COVID-19 Pandemic Iain Cruickshank and Kathleen Carley
Session 7: Data-driven software engineering
14h00 - 14h20 Calibrated Viewability Prediction for Premium Inventory Expansion Jonathan Schler and Allon Hammer
14h20 - 14h40 Data Driven Policy Making: The Peruvian Water Resources Observatory Giuliana Barnuevo Reategui, Elsa Galarza, Maria Paz Herrera, Juan G. Lazo Lazo, Miguel Nunez-Del-Prado and Jose Luis Ruiz
14h40 - 15h20
Keynote Speaker: Jiang Bian

Title: The Big Short with AI in Biomedical Sciences: When Actions Don’t Follow Predictions


Big data, high-performance computing, and (deep) machine learning are increasingly becoming key to precision medicine—from identifying disease risks and taking preventive measures, to making diagnoses and personalizing treatment for individuals. Precision medicine, however, is not only about predicting risks and outcomes, but also about weighing interventions. Interventional clinical predictive models require the correct specification of cause and effect, and the calculation of so-called counterfactuals, that is, alternative scenarios.

Session 8: Graph mining
15h20 - 15h40 Use of complex networks to differentiate elderly and young people Aruane Pineda and Francisco Aparecido Rodrigues
15h40 - 16h00 Analysis of the Health Network of Metropolitan Lima against Large-scale Earthquakes Miguel Nunez-Del-Prado and John Barrera
16h00 - 16h20 Quasiquadratic time algorithms for square and pentagon counting in real-world networks Grover E.C. Guzman and Jared León
16h20 - 16h40 Identifying COVID-19 Impact on Peruvian Mental Health during Lockdown using Social Network Josimar Chire and Jimmy Oblitas