Program - SIMBig 2020

Each long paper takes 15 minutes of presentation and 5 minutes of questions.

Each short paper takes 10 minutes of presentation and 5 minutes of questions.

Thursday, 1st of October
Hour	Activity	Authors
8h30 - 9h30	Registration to the Zoom Platform
9h20 - 9h40	Welcome to SIMBig 2020
9h40 - 10h00	Diagnosis of SARS-CoV-2 based on patient symptoms and diffuse classifiers	*Fray Luis Becerra Suarez, Herber Ivan Mejia Cabrera and Victor Alexci Tuesta Monteza*
10h00 - 10h40	Keynote Speaker: Ian Horrocks Title: Knowledge Graph Creation and Curation Abstract: Knowledge Graphs are increasingly important resources in science, medicine and industry, and systems for storing and querying Knowledge Graphs are becoming increasingly capable. However, creating and curating high quality knowledge is still a hard problem, and this could impede their adoption. In this talk we will consider the nature of the problem and survey some recent (and not so recent) work that attempts to address it.
	Session 1: Semantic Web and Social Networks
10h40 - 11h00	Distributed Identity Management for Semantic Entities	*Falko Schönteich, Andreas Kasten and Ansgar Scherp*
11h00 - 11h20	Telegram: Data Collection, Opportunities and Challenges	*Tuja Khaund, Muhammad Nihal Hussain, Mainuddin Shaik and Nitin Agarwal*
11h20 - 11h40	Graph theory applied to International Code of Diseases (ICD) in a hospital	*Boldorini Jr. Claudio, Carlos Euzebio, Lucas Porto, Alexandre Martinez and Evandro Ruiz*
11h40 - 12h00	CovidStream: Interactive Visualization of Emotions Evolution Associated with COVID-19	*Herwin Alayn Huillcen Baca, Flor de Luz Palomino Valdivia, Yalmar Ponce Atencio, Manuel Ibarra Cabrera, Mario Aquino Cruz and Melvin Edward Huillcen Baca*
PAUSE
14h00 - 14h40	Keynote Speaker: Dina Demner Title: Retrieving information, answering questions, and detecting misinformation about COVID-19 Abstract: One of the effects of the COVID-19 pandemic is a rapidly growing and changing stream of publications to inform clinicians, researchers, policy makers, and patients about the health, socio-economic, and cultural consequences of the pandemic. Leveraging this stream of information is essential for developing policies, guidelines and strategies during the pandemic, for recovery after the COVID-19 pandemic, and for designing measures to prevent recurrence of similar threats. Managing this information stream manually is not feasible. Automated approaches are needed to quickly bring the most salient and reliable points to the readers’ attention. Leveraging the CORD-19 collection of scientific articles, data from government sites, and questions about SARS-CoV-2 asked by researchers, clinicians and the public, in collaboration with the National Institute of Standards and Technology (NIST), Ai2, UTHealth and OHSU researchers, we have developed datasets for retrieval of COVID-19 information and automatic question answering. These datasets allowed us to (1) conduct a community-wide evaluation of the systems’ ability to satisfy the needs for high-quality timely information about COVID-19; (2) develop deep-learning approaches to meeting information needs as they evolve during pandemics; and (3) develop approaches to detection of misinformation. This talk will present the approaches to dataset development, our systems for question answering and misinformation detection, and some preliminary results.
	Session 2: Natural Language Processing and Text mining
14h40 - 15h00	Comparative Analysis of Question Answering Models for HRI Tasks with NAO in Spanish	*Enrique Burga-Gutierrez, Bryam Vasquez-Chauca and Willy Ugarte*
15h00 - 15h20	Peruvian citizens reaction to Reactiva Perú Program: A Twitter Sentiment Analysis Approach	*Rosmery Ramos-Sandoval*
15h20 - 15h40	Twitter Early Prediction of preferences and tendencies based in neighborhood behavior	*Emanuel Meriles, Martin Ariel Dominguez and Pablo Gabriel Celayes*
15h40 - 16h00	Summarization of Twitter Events with Deep Neural Network Pre-trained Models	*Kunal Chakma, Amitava Das and Swapan Debbarma*
16h00 - 16h20	Multi-strategic author names disambiguation in bibliography repositories	*Natan de Souza Rodrigues, Aurelio Ribeiro Costa, Lucas Correa Lemos and Celia Ghedini Ralha*
16h20 - 16h40	Machine Learning techniques for speech emotion classification	*Noe Melo Locummber and Junior Fabian*
16h40 - 17h00	An evaluation of physiological public datasets for emotion recognition systems	*Alexis Mendoza, Alvaro Cuno, Wilber Ramos and Nelly Condori-Fernandez*
17h00 - 17h40	Keynote Speaker: Sophia Ananiadou Title: Text mining methods for evidence-based medicine Abstract: Text mining methods have been used for the effective and timely identification of knowledge for evidence-based medicine. Evidence-based medicine seeks to answer well-posed questions using existing results by applying a systematic and transparent methodology. The evidence, e.g., results of clinical trials, should be collected and evaluated without bias, using consistent criteria for inclusion. Literature databases such as MEDLINE provide instant access to millions of research articles, obtained from an ever-expanding list of indexed journals. This progress comes at a cost. The majority of clinicians do not possess the expert literature search skills required to develop complex search strategies, and existing databases are manually updated. I will discuss methods for automating systematic reviews, covering search, screening, and data extraction. These text mining methods have been integrated into the RobotAnalyst system, currently used by NICE.

Friday, 2nd of October
Hour	Activity	Authors
8h30 - 9h00	Registration to the Zoom Platform
9h20 - 10h00	Keynote Speaker: Maguelonne Teisseire Title: Text mining activities in the context of the MOOD project Abstract: The MOOD project (https://mood-h2020.eu) aims to apply data mining and analytical techniques to big data originating from multiple sources in order to improve detection, monitoring, and assessment of emerging diseases in Europe. In this presentation, I will present part of the text mining activities conducted in the TETIS lab. In particular, we will see how text analysis could help to detect weak signals for epidemic outbreak detection.
	Session 3: Machine Learning
10h00 - 10h20	YTTREX: crowdsourced analysis of Youtube's recommender system during COVID-19 pandemic	*Leonardo Sanna, Salvatore Romano, Giulia Corona and Claudio Agosti*
10h20 - 10h40	Parallel Social Spider Optimization Algorithms with Island Model for the Clustering Problem	*Edwin Alvarez-Mamani, Lauro Enciso-Rodas, Mauricio Ayala-Rincón and José Luis Soncco-Álvarez*
	Session 4: Machine Learning
10h40 - 11h00	Two-class fuzzy clustering ensemble approach based on a constraint on fuzzy memberships	*Omid Aligholipour and Mehmet Kuntalp*
11h00 - 11h20	Modeling and Predicting the Lima Stock Exchange General Index with Bayesian Networks and Information from Foreign Markets	*Daniel Chapi, Soledad Espezua Llerena, Julio Villavicencio, Oscar Enrique Miranda Castillo and Edwin Rafael Villanueva Talavera*
11h20 - 11h40	Comparative study of spatial prediction models for estimating PM2.5 concentration level in urban areas	*Irvin Vargas Campos and Edwin Villanueva*
11h40 - 12h00	Prediction of Solar Radiation using Forecasting Neural Networks	*Marcos Ponce-Jara, Alvaro Talavera, Carlos Velásquez and David Tonato*
12h00 - 12h20	COVID-19 Infection Prediction and Classification	*Souad Taleb Zouggar and Abdelkader Adla*
PAUSE
14h00 - 14h40	Keynote Speaker: Andrew Tomkins Title: Incorporating Graph Data into Machine Learning Abstract: We perform data mining tasks today over huge and growing datasets. To handle scale, we rely on a handful of highly optimized primitive operations. The current workhorse primitive in training of ML models is stochastic gradient descent, which incorporates per-instance label data efficiently into a model's internal state. However, information is often available to us, not just as labels, but also through connections between data instances, often presented as graphs, sometimes in other forms. In this talk, Andrew Tomkins will describe a number of different approaches to incorporating such higher-order data into scalable training and inference, and will suggest some open problems in this area.
	Session 5: Image processing
14h40 - 15h00	Towards a benchmark for sedimentary facies classification: Applied to the Netherlands F3 Block	*Maykol Jiampiers Campos Trinidad, Smith Washington Arauco Canchumuni and Marco Aurelio Cavalcanti Pacheco*
15h00 - 15h20	Mobile application for movement recognition in the rehabilitation of the anterior cruciate ligament of the knee	*Iam Contreras-Alcázar, Kreyh Contreras-Alcázar and Victor Cornejo-Aparicio*
15h20 - 15h40	Semantic Segmentation using Convolutional Neural Networks for Volume Estimation of Native Potatoes at High Speed	*Miguel Angel Chicchon Apaza and Ronny Michael Huerta Firma*
15h40 - 16h00	Symbiotic Trackers’ Ensemble with Trackers’ Re-initialization for Face Tracking	*Victor Ayma, Patrick Happ, Raul Feitosa, Gilson Costa and Bruno Feijó*
16h00 - 16h20	Multi-class Vehicle Detection and Automatic License Plate Recognition based on YOLO in Latin American Context	*Pedro Montenegro, Jhonatan Camasca and Junior Fabian*
16h20 - 16h40	Static Summarization using Pearson’s Coefficient and Transfer Learning for Anomaly detection for Surveillance Videos	*Steve Chancolla-Neira, Cesar Salinas-Lozano and Willy Ugarte*
16h40 - 17h00	Humpback Whale’s Flukes Segmentation Algorithms	*Andrea Jackeline Castro Cabanillas and Victor Hugo Ayma Quirita*
17h00 - 17h20	Improving Context-Aware Music Recommender Systems with a Dual Recurrent Neural Network	*Igor Santana and Marcos Domingues*

Saturday, 3rd of October
Hour	Activity	Authors
9h00 - 9h20	Registration to the Zoom Platform
9h20 - 10h00	Keynote Speaker: Francisco Rodrigues Title: Predicting dynamical processes in complex networks: a machine learning approach Abstract: One of the most fundamental problems in Network Science is to understand how dynamical processes are influenced by the network organization. For instance, if we can understand how patterns of connections between coupled oscillators influence the evolution of the synchronous state, then we can change the network topology to control the level of synchronization of power grids and electronic circuits. In this talk, we will discuss how machine learning methods can be used to predict disease propagation in networks and the state of coupled oscillators. The current challenges in Network Science and some possible ideas for future research will also be discussed in our talk.
	Session 6: Social Networks
10h00 - 10h20	Classification of Cybercrime Indicators in Open Social Data	*Ihsan Ullah, Caoilfhionn Lane, Teodora Buda, Brett Drury, Marc Mellotte, Haytham Assem and Michael Madden*
10h20 - 10h40	StrCoBSP: relationship Strength-aware Community-based Social Profiling	*Asma Chader, Hamid Haddadou, Leila Hamdad and Walid-Khaled Hidouci*
10h40 - 11h00	Identifying Differentiating Factors for Cyberbullying in Vine and Instagram	*Rahat Ibn Rafiq, Homa Hosseinmardi, Richard Han, Qin Lv and Shivakant Mishra*
11h00 - 11h20	Effect of Social Algorithms on Media Source Publishers in Social Media Ecosystems	*Ittipon Rassameeroj and S. Felix Wu*
11h20 - 11h40	The Identification of Framing Language in Business Leaders' Speech From the Mass Media	*Brett Drury and Samuel Morais Drury*
11h40 - 12h20	Clustering Analysis of Website Usage on Twitter during the COVID-19 Pandemic	*Iain Cruickshank and Kathleen Carley*
PAUSE
	Session 7: Data-driven software engineering
14h00 - 14h20	Calibrated Viewability Prediction for Premium Inventory Expansion	*Jonathan Schler and Allon Hammer*
14h20 - 14h40	Data Driven Policy Making: The Peruvian Water Resources Observatory	*Giuliana Barnuevo Reategui, Elsa Galarza, Maria Paz Herrera, Juan G. Lazo Lazo, Miguel Nunez-Del-Prado and Jose Luis Ruiz*
14h40 - 15h20	Keynote Speaker: Jiang Bian Title: The Big Short with AI in Biomedical Sciences: When Actions Don’t Follow Predictions Abstract: Big data, high-performance computing, and (deep) machine learning are increasingly becoming key to precision medicine—from identifying disease risks and taking preventive measures, to making diagnoses and personalizing treatment for individuals. Precision medicine, however, is not only about predicting risks and outcomes, but also about weighing interventions. Interventional clinical predictive models require the correct specification of cause and effect, and the calculation of so-called counterfactuals, that is, alternative scenarios.
	Session 8: Graph mining
15h20 - 15h40	Use of complex networks to differentiate elderly and young people	*Aruane Pineda and Francisco Aparecido Rodrigues*
15h40 - 16h00	Analysis of the Health Network of Metropolitan Lima against Large-scale Earthquakes	*Miguel Nunez-Del-Prado and John Barrera*
16h00 - 16h20	Quasiquadratic time algorithms for square and pentagon counting in real-world networks	*Grover E.C. Guzman and Jared León*
16h20 - 16h40	Identifying COVID-19 Impact on Peruvian Mental Health during Lockdown using Social Network	*Josimar Chire and Jimmy Oblitas*
