Program of SIMBig 2019

PROGRAM

Each long paper takes 15 minutes of presentation and 5 minutes of questions.

Each short paper and poster takes 10 minutes of presentation and 5 minutes of questions.

Wednesday 21 August
Hour	Presentation	Authors
8h00 - 8h45	Registration and reception
8h45 - 9h00	Inauguration of SIMBig 2019
9h00 - 10h00	Keynote Speaker: Sophia Ananiadou
9h00 - 10h00	Title: Text Mining for Biomedical Applications Abstract: This talk will provide an overview of recent developments in neural information extraction at the National Centre for Text Mining to support biomedical applications. Applications include pathway construction, database curation, semantic search and systematic review development. Two systems will be presented: a) LitPathExplorer, a visual text analytics tool that integrates advanced text mining, semi-supervised learning and interactive visualization to facilitate the exploration and analysis of pathway models, and b) RobotAnalyst, a web-based screening system combining machine learning and text mining for document prioritisation.
	Session 1: Social Media and NLP
10h00 - 10h20	Spanish Sentiment Analysis with Universal Language Model Fine-Tuning: A detailed case of study	Daniel Palomino and Jose Eduardo Ochoa Luna
10h20 - 10h40	Coffee break
10h40 - 11h00	Detecting Anomalies in Time-varying Media Crime News Using Tensor Decomposition	Hugo Alatrista-Salas, Juandiego Morzan, Miguel Nunez-Del-Prado, Gustavo Yamada and Pablo Lavado Padilla
11h00 - 11h20	What can we learn from tweets? A textual analysis of tweets using #bacon as inductor word	Erick Saldaña, Dominique Valentin, Jorge Behrens, Miriam Mabel Selani, Iliani Patinho and Carmen J Contreras-Castillo
11h20 - 11h40	A fuzzy linguistic approach for stakeholder prioritization	Yasiel Pérez Vera and Anié Bermudez Peña
11h40 - 11h55	Controlling Formality and Style of Machine Translation Output Using AutoML on NMTs	Varden Wang, Aditi Viswanathan, Antonina Kononova
11h55 - 12h10	Automatic Speech Recognition of Quechua Language Using HMM Toolkit	Rodolfo Zevallos, Luis Camacho and Johanna Cordova
12h10 - 12h25	Collect Ethically: Eliminate and Reduce Bias in Twitter Datasets	Lulwah Alkulaib, Abdulaziz Alhamadani, Taoran Ji and Chang-Tien Lu

14h00 - 15h00	Keynote Speaker: Vipin Kumar
14h00 - 15h00	Title: Big Data in Climate and Earth Sciences: Challenges and Opportunities for Data Science Abstract: The climate and earth sciences have recently undergone a rapid transformation from a data-poor to a data-rich environment. In particular, massive amounts of data about Earth and its environment are now continuously being generated by a large number of Earth-observing satellites as well as physics-based earth system models running on large-scale computational platforms. These massive and information-rich datasets offer huge potential for understanding how the Earth's climate and ecosystem have been changing and how they are being impacted by human actions. This talk will discuss various challenges involved in analyzing these massive datasets as well as the opportunities they present for advancing both machine learning and the science of climate change, in the context of monitoring the state of the tropical forests and surface water on a global scale.
	Session 2: Data Mining	Chair: Miguel Nuñez del Prado
15h00 - 15h20	A Place to Go: Locating Damaged Regions after Natural Disasters through Mobile Phone Data	Galo Castillo-López, María-Belén Guaranda, Fabricio Layedra and Carmen Vaca
15h20 - 15h40	Coffee break
15h40 - 16h00	Recurrence Plot Representation for Multivariate Time-series Analysis	Dennys Mallqui and Ricardo Fernandes
16h00 - 16h20	Privacy Preservation and Inference With Minimal Mobility Information	Miguel Nunez-Del-Prado and Julián Salas
16h20 - 16h40	A Progressive Formalization of Tacit Knowledge to Improve Semantic Expressiveness of Biodiversity Data	Andrea Corrêa Flôres Albuquerque and Jose Laurindo Campos Dos Santos
16h40 - 17h00	Big Data Recommender System for Encouraging Purchases in New Places Taking into Account Demographics	Miguel Nunez-Del-Prado, Hugo Alatrista-Salas, Ana Luna and Isaias Hoyos
17h00 - 17h15	Implementation of an indoor location system for mobile-based museum guidance	Dennis Núñez Fernández
18h00 - 20h00	Welcome Cocktail

Thursday 22 August
Hour	Presentation	Authors
9h00 - 10h00	Keynote Speaker: Ravi Kumar
9h00 - 10h00	Title: Crowdsourced Geodata: Some Applications
	Session 3: Machine Learning	Chair: Michelle Rodriguez / Mario Chong
10h00 - 10h20	Anomaly Detection and Levels of Automation for AI-supported system administration	Odej Kao
10h20 - 10h40	Coffee break
10h40 - 11h00	Characterization of Salinity Impact on Synthetic Floc Strength Via Nonlinear Component Analysis	Hang Yin, Patrick Carriere, Huey Lawson, Habib Mahamadian, Zhengmao Ye
11h00 - 11h20	Global Brand Perception based on Social Prestige, Credibility and Social Responsibility: A Clustering Approach	Rosario Medina, Alvaro Talavera, Martín Hernani, Juan Lazo and José Afonso Mazzon
11h20 - 11h40	SCUT sampling technique with classification algorithms to classify child malnutrition	Juan Baraybar-Huambo and Juan Gutierrez-Cardenas
11h40 - 12h00	Come with Me Now: New Potential Consumers Identification from Competitors	Miguel Nunez-Del-Prado, Hugo Alatrista-Salas and Victoria Zevallos
12h00 - 12h15	An efficient set-based algorithm for variable streaming clustering	Isaac Campos Ardiles, Jared León Malpartida and Fernando Campos Ardiles

14h00 - 15h00	Keynote Speaker: Michael Franklin
14h00 - 15h00	Title: Towards a New Discipline of Data Science Abstract: The emergence of Data Science has led to a flourishing of initiatives, centers, degrees, programs and organizational units at educational and research institutions around the world. The demand for data science know-how from students, parents, scientists and employers is strong and getting stronger. However, the interdisciplinary nature of the topic and the lack of a consensus around its definition raise challenges for its implementation in the modern university setting. Many ongoing efforts treat Data Science as simply a combination of topics from existing fields. While such an approach has obvious practical advantages, I believe that the challenges raised by Data Science imply that it should be more productively pursued as a new discipline in its own right. In this talk I will try to frame this larger question with a goal of initiating a discussion to identify the intellectual opportunities and research questions that could lie at the heart of a new discipline of Data Science.
	Session 4: Semantic Web and Knowledge Bases	Chair: Mario Chong
15h00 - 15h20	Using Embeddings to Predict Changes in Large Semantic Graphs	Damian Barsotti and Martin Ariel Dominguez
15h20 - 15h40	Coffee break
15h40 - 16h40	Keynote Speaker: Nigam Shah
15h40 - 16h40	Title: Good machine learning for better healthcare Abstract: We will discuss lessons from Stanford Medicine's Program for Artificial Intelligence (AI) in Healthcare, whose mission is to bring AI technologies to the clinic safely, cost-effectively and ethically. Using our experience in deploying a predictive model to improve access to palliative care services, we will discuss potential solutions to issues relating to model correctness, interpretability, fairness, and equity as well as issues such as autonomy of decision making and fiduciary responsibility. Drawing on our experience in running a clinical consult service for generating evidence from the collective experience of patients, we will discuss the challenges of, and potential solutions for, using aggregate patient data at the bedside.
	Session 5: Biomedical Informatics	Chair: Rocio Meherara
16h40 - 17h00	Sparse non-negative matrix factorization for retrieving genomes across metagenomes	Vincent Prost, Stéphane Gazut and Thomas Brüls
17h00 - 17h20	Linguistic Fingerprints of Pro-vaccination and Anti-vaccination Writings	Rebecca A Stachowicz
17h20 - 17h35	Comparing predictive machine learning algorithms in fit for work occupational health assessments	Moises Stevend Meza, Saul Charapaqui, Katherine Arapa and Horacio Chacón
17h35 - 17h50	Tensorflow for Doctors	Isha Agarwal, Rajkumar Kolakaluri, Michael Dorin and Mario Chong
17h50 - 18h05	Detection of NSCLC adenocarcinoma using supervised machine learning algorithms applicated to metabolomic profiles	Paulo Vela-Anton and Diego Rondon-Soto
19h30	Gala Dinner

Friday 23 August
Hour	Presentation	Authors
	Session 6: Data-driven Software Engineering	Chair: Miguel Nuñez del Prado / Ana Luna
9h30 - 10h00	An Industry Perspective on Data-Driven Software Engineering	Ludmer Arcaya - Avantica
10h00 - 10h20	Design of cognitive tutor to diagnose the types of intelligence in students from 3 to 5 years of preschool	Flor De Maria Olivares Ramos
10h20 - 10h40	Fake News in Spanish: Towards the building of a Corpus based on Twitter	Braulio Andres Soncco Pimentel and Roxana Portugal
10h40 - 11h00	Coffee break
11h00 - 11h20	Big Data Use Case: Luca Transit	Lourdes Guiulfo (Telefónica)
11h20 - 11h40	Development of a hand gesture based control interface using Deep Learning	Dennis Núñez Fernández
11h40 - 12h00	Chronic Pain Estimation Through Deep Facial Descriptors Analysis	Manasses Antoni Mauricio Condori, Jonathan David Peña Andagua, Erwin Junger Dianderas Caut, Leonidas Mauricio Condori, Jose Carlos Díaz Rosado and Antonio Manuel Moran Cárdenas
12h00 - 12h15	Recognition of the image of a person's silhouette, based on Viola-Jones	Washington Garcia Quilachamin, Luzmila Pro Concepción and Jorge Herrera-Tapia
12h15 - 12h30	Super Resolution approach using Generative Adversarial Network models for improving Satellite Image Resolution	Ferdinand Pineda, Victor Ayma, Robert Aduviri and César Beltrán
12h30 - 12h45	Peruvian sign language recognition using a hybrid deep neural network	Yuri Vladimir Huallpa Vargas, Naysha Naydu Diaz Ccasa and Lauro Enciso Rodas
12h45 - 13h00	Closing
