Sreejith Menon


About Me

I am a graduate student in Computer Science at the University of Illinois at Chicago. I currently work as a Graduate Research Assistant at the Computational Population Biology lab under Prof. Tanya Berger-Wolf.
I am passionate about Machine Learning, Data Mining and Statistical Analysis of large datasets. My go-to language is Python, and most of my work uses Python3 and Bash shell scripting. I have worked in Data Science for about 4 years. During my tenure as an Application Developer at J. P. Morgan Chase & Co., I worked on Data Warehousing for Point-of-Sale and Data Analytics for fraud monitoring. Since joining UIC, my focus has shifted to Data Mining and Machine Learning, giving me all-round exposure to various aspects of Data Science.
My current research involves building a population estimation model using pictures mined from publicly shared albums on different social media platforms. Population estimation from social media images is non-trivial because of the inherent complexities introduced by biases in social media data. We attempt to quantify and understand the biases that influence how people share wildlife images. My advisors for this research project are Prof. Tanya Berger-Wolf, UIC and Emre Kiciman, Microsoft Research.

Academic Background

Graduate School:

Master of Science in Computer Science
University of Illinois at Chicago
August 2015 - Present

Undergraduate School:

Bachelor of Engineering in Computer Engineering
Ramrao Adik Institute of Technology
August 2009 - May 2013


Thesis

    Animal Wildlife Estimator using Social Media

    Duration: October 2015 - Present

    Description: AWESOME is a research initiative to build a population estimator using pictures from social media. Tracking wildlife populations with conventional methods incurs a financial as well as an operational burden. AWESOME tries to solve this problem by turning to a promising and opportunistic form of citizen science: mining images of animals from social media. Estimating population from social media images is non-trivial because of the inherent complexities introduced by biases in social media data. We attempt to quantify and understand the biases that influence how people share wildlife images. The first step towards building an estimator that accounts for such self-reporting bias is to build a classifier that can learn whether a picture of a certain wildlife species will be shared. Get Project Details Here.

    Languages/Technologies: Python3, pandas DataFrames, Folium, Plot.ly, scikit-learn (including sklearn.metrics), Amazon Mechanical Turk (AWS), Bash Shell, Windows batch

    GitHub

My Projects

  1. Santander Customer Satisfaction

    Duration: March 2016 - May 2016

    Description: The project identifies unsatisfied customers of Santander. The dataset had approximately 570 anonymized attributes, of which about 45 were selected using attribute selection techniques. Information gain was calculated for every attribute, and the attributes whose information gain exceeded the average over all attributes were passed to the learning algorithm. Decision trees and neural networks were trained on the training data; the decision tree model achieved over 98% accuracy with 0.96 area under the ROC curve. Get Project Details Here.

    Languages/Technologies: Python3, pandas DataFrames, scikit-learn

    GitHub
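
The attribute selection rule described for the Santander project (keep the attributes whose information gain is above the average) can be sketched roughly as follows. This is only an illustration, not the project's actual code: the function names, the toy columns `var_a`, `var_b`, `var_c` and the labels are hypothetical, and it assumes attributes are already discrete (continuous attributes would need to be binned first).

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(values, labels):
    """Label entropy minus the expected label entropy after splitting on values."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)
    total = len(labels)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def select_above_average(columns, labels):
    """Return the attribute names whose gain exceeds the mean gain, plus all gains."""
    gains = {name: information_gain(vals, labels) for name, vals in columns.items()}
    mean_gain = sum(gains.values()) / len(gains)
    selected = [name for name, gain in gains.items() if gain > mean_gain]
    return selected, gains


# Hypothetical toy data, not taken from the Santander dataset.
labels = [0, 0, 1, 1, 0, 1]
columns = {
    "var_a": [0, 0, 1, 1, 0, 1],  # mirrors the label exactly
    "var_b": [0, 1, 0, 1, 0, 1],  # weakly related to the label
    "var_c": [1, 1, 1, 1, 1, 1],  # constant, carries no information
}

selected, gains = select_above_average(columns, labels)
print(selected)
```

On this toy data only the attribute that mirrors the label clears the average, which is the behavior the thresholding rule is meant to produce.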

  2. Spatio Temporal Resource Search - Parking Spot

    Duration: February 2016 - April 2016

    Description: Implemented real-time, probabilistic and baseline greedy approaches to find parking spots and route the user of the system to a parking location. The implementation is based on Deterministic Magnitude GRA (DM-GRA), described in detail in the paper “Spatio-temporal Matching Algorithms for Road Networks” by Daniel Ayala, Ouri Wolfson, Bo Xu and Bhaskar DasGupta. The advanced parking search techniques improved performance (measured as average total time saved per ride) by 34% compared to the baseline approach.

    Languages/Technologies: Python3, pandas DataFrames, MySQL, HTML/JavaScript (minimal front end)

    GitHub

  3. Bayesian Medical Expert using NETICA

    Duration: February 2016 - March 2016

    Description: The Bayesian Medical Expert system is a probabilistic inference system that assesses the probability of a patient having a heart-related disease or diabetes. The system takes numerous symptoms into consideration and builds a Bayesian Network from them. The symptoms for heart diseases and diabetes are categorized into three groups: less critical, medium critical and very critical symptoms. The conditional probability tables for these groups are generated from individual root-node symptoms that a physician can observe directly. Many of the symptoms are qualitative and cannot be quantified, which is why a probabilistic system is highly beneficial. The relationships between the various stages can be clearly represented by the Bayesian Network.

    Languages/Technologies: Python3, NETICA

    GitHub

  4. Personal Fitness Fuzzy Expert System using FuzzyJ

    Duration: February 2016

    Description: Fitness Star is a fuzzy expert system built on FuzzyJ that takes certain health-related parameters from the user, infers certain vital parameters and makes recommendations to the user. The system is a prototype that classifies the user as underweight, normal or overweight. It also assesses the chances of diabetes and coronary heart disease, estimates stress levels and recommends workouts.

    Languages/Technologies: Java Expert System Shell with Fuzzy extension (FuzzyJ)

    GitHub

  5. Personal Fitness Expert System using JESS

    Duration: January 2016 - February 2016

    Description: Fitness Star is a rule-based expert system built on JESS that takes certain health-related parameters from the user, infers certain vital parameters and makes recommendations to the user. The system advises users on workout and food intake plans based on their age, gender, weight, height and resting heart rate. It can also assess the chances of diabetes and coronary heart conditions based on blood pressure levels, sugar levels and sleep patterns.

    Languages/Technologies: Java Expert System Shell

    GitHub

  6. Ad classifier for HTML pages using logistic regression

    Duration: October 2015 - December 2015

    Description: Designed and proposed multiple machine learning techniques for image classification. Imputed missing data using the k-means clustering algorithm. Implemented a learning algorithm in Python using logistic regression. Achieved an overall 10-fold cross-validation accuracy of 98%.

    Languages/Technologies: Python3, pandas DataFrames, scikit-learn

    GitHub
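
As a rough illustration of the evaluation mentioned for the ad classifier (logistic regression scored with 10-fold cross-validation), here is a minimal sketch in plain Python. It is not the project's code: the gradient-descent trainer, the function names, the hyperparameters and the synthetic two-feature data are all assumptions made for the example, and the k-means imputation step is left out.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, lr=0.1, epochs=200):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict(model, xi):
    """Predict class 1 when the estimated probability is at least 0.5."""
    w, b = model
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0


def cross_val_accuracy(X, y, k=10, seed=0):
    """Mean accuracy over k folds; every sample is held out exactly once."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = train_logistic([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in fold)
        scores.append(correct / len(fold))
    return sum(scores) / k


# Synthetic, well-separated toy data (hypothetical, not the ad dataset).
rng = random.Random(1)
X, y = [], []
for _ in range(100):
    label = rng.randint(0, 1)
    center = 2.0 if label else -2.0
    X.append([center + rng.gauss(0, 0.5), center + rng.gauss(0, 0.5)])
    y.append(label)

accuracy = cross_val_accuracy(X, y)
print(f"10-fold cross-validation accuracy: {accuracy:.2f}")
```

Averaging over folds in which each sample is held out once gives a less optimistic estimate than scoring on the training data, which is presumably why the cross-validated figure is the one reported.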


Professional Experience

  • University of Illinois at Chicago

    Graduate Research Assistant
    May 2016 – Present


  • Performed research to identify a predictive population estimation model using social media photos under the collective advice of Prof. Tanya Berger-Wolf (UIC) and Emre Kiciman (Microsoft Research).
  • Mentored 3 undergraduate students from UIC in a project to infer if stripe-patterns in zebras are inherited.


  • University of Illinois at Chicago

    Graduate Teaching Assistant
    January 2016 – May 2016


  • Facilitated lab sessions and review sessions for the Introduction to Computing and Programming course in the Department of Computer Science.
  • Coordinated with Prof. Douglas Hogan in evaluating and proctoring examinations.


  • J. P. Morgan Chase & Co.

    Technology Analyst
    July 2013 – July 2015


  • Built high-performance real-time data-warehousing solutions using Ab Initio in conjunction with Bash scripting.
  • Led the remodeling of a load strategy that saved over 2000 CPU hours and 100 person-hours per month.
  • Awarded the title of Subject Matter Expert for enabling the team to achieve tight service level agreements by formulating solutions that required a deep understanding of both functional and technical aspects of the project.


Contact Details


Email ID: smenon8@uic.edu

LinkedIn

GitHub

Flickr

Facebook