- Advanced knowledge of Java, C, Perl, HTML
- Good knowledge of C++, C#, Python, Pig Latin
- Basics of PHP, jQuery, JavaScript, CSS, Matlab
- Advanced algorithms and data structures
- Distributed computing (Apache Hadoop, Apache Pig, Apache Spark, PredictionIO)
- Tools: Git, Eclipse, Sublime, SVN (and many other tools)
- Inferring gender of a Twitter user using celebrities it follows (Java, Perl, Google API, Apache Pig)
- Quena - An automatic question answering system (Java, Apache Solr, Stanford NER, Stanford POS tagger, web crawler)
- Clustering of song files according to mood derived from signals and lyrics (Matlab, Perl, C#)
- English to Hindi Translation and Transliteration (C, Gambas) [Natural Language Processing]
- PSLFS - Virtual File System, based on the K-Ary Data Structure with a Command Line Interface (C)
- Amazon Dynamo algorithm - replicated key-value storage (Distributed systems, Java)
- Memodiction - An Android app that lets you learn and revise words to build your vocabulary intelligently
- Tweetiments.com - Sentiment Analysis of Twitter data (Perl, PHP)
- HappyOrSad - Perl script to detect the mood of text and lyrics (https://github.com/puneetsl/HappyOrSad)
- ShortCleaner - A Java library that cleans tweets for use in various kinds of analytics.
- jTextBrew - A Java library for fuzzy string matching, based on the TextBrew algorithm by Chris Brew.
- Reviewed 5 chapters of "Apache Solr Essentials" (http://goo.gl/Q7h3mW)
- "Inferring latent attributes of an Indian Twitter user using celebrities and class influencers", ACM Hypertext 2015
- "Architecture for Automated Tagging and Clustering of Song Files According to Mood", IJCSI, 2010
- "Inferring gender of a Twitter user using celebrities it follows", CSE Dept, University at Buffalo, 2014
- "PSLFS: Command Line Interpreter and Virtual File System", ICCC, 2008.
NLP Engineer, April 2015 - Present, FactSet Research Systems – New York City
- Duplicate document identification (Java, Shell Scripts)
- Used Shingling and Vector Space Model algorithms to detect duplicates in a stream of documents.
- Statistical Machine Translation (Moses, Python)
- Building a language translation engine for Polish, similar to Google Translate, based on the principles of statistical machine translation.
- Cleaning, tokenization, stemming, and other language processing are done with NLTK and other NLP libraries in Python.
- Earnings and Stock Correlation using Social Media Sentiment (Python, Java, PredictionIO)
- Exploratory project to find correlations between social media sentiment and a company's earnings and stock.
Research Engineer, July 2011 - July 2013, Innovation Labs, TATA Consultancy Services – Delhi
- Time Series Analysis on Big Data (Technology and tools used: Java, Python, Perl, Shell Scripts, RapidMiner)
- Worked on a research project involving multivariate time series analysis of car sensor data.
- Devised an algorithm based on Shape Context for finding frequently occurring patterns and events, with results comparable to SAX, DTW, etc., and 7% better results in the particular domain of car sensors.
- Data Harmonization Framework (DHF) (Technology and tools used: Java, Apache Pig, XML)
- Implemented an ETL framework that exploits the power of MapReduce and big databases to fuse incongruous enterprise data from disparate sources in near real time.
Developer, December 2010 - June 2011, ILP Training Centre, TATA Consultancy Services – Trivandrum
- Trainee Evaluation System: Led a team of two in developing the 'Trainee Evaluation System' as a Moodle plugin, deployed at the TCS Trivandrum training centre. (PHP, MySQL, jQuery)
- Resource Planning and Scheduling System: Developed a solution for managing infrastructure resources and scheduling lectures. (PHP, MySQL)
Master of Science in Computer Science; State University of New York, Buffalo; GPA: 3.482, 2014
Bachelor of Technology in Computer Science; Jaypee Institute of Information Technology, Noida, 2010
- Time Series Analysis
- Music Information Retrieval
- Sentiment Analysis
- Web Data Mining
- Social Network Data Mining
"pludu [at] buffalo.edu" | +1-(716) TOP-HEGG | My personal website