TOM AND SPIKE CLASSIFIER
An implementation of the TensorFlow Object Detection API
This project is a hands-on tutorial for anyone who aspires to learn object detection. To keep things simple, I chose to implement it using Google's TensorFlow Object Detection API. I hosted a meetup for this project, which around 70 people attended. I decided to choose a dataset that would interest everyone - The Tom and Jerry Show, because why not! The session was recorded and the lecture is posted on YouTube. I also wrote a series of four blog posts for this project; you can find the series here.
AUTO DEEP LEARNING
An attempt to make Deep Learning easier for everyone
This project makes deep learning easier for everyone. Users, both with and without ML knowledge, can train small neural networks on their data (currently only CSV data is supported).
The user just needs to upload a CSV and specify the type of task (regression or classification), along with the name of the target column. The system then selects the best features, scales them according to their data types, decides the number of layers to use, chooses the activation and loss functions, applies cross-validation, and stores the results in an appropriate location.
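As a rough sketch of how such a pipeline could be wired together (this is not the project's actual code), scikit-learn stand-ins can play each role: SelectKBest for feature selection, StandardScaler for scaling, and an MLPClassifier with heuristic layer sizes for the network. It assumes a numeric-only CSV and a classification task; the real system also handles regression and mixed data types.

```python
# Illustrative sketch only: scikit-learn stand-ins for the automated
# pipeline (feature selection -> scaling -> small neural net -> CV).
# Assumes a numeric-only CSV and a classification task.
import pandas as pd
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

def auto_train(df: pd.DataFrame, target: str) -> float:
    X, y = df.drop(columns=[target]), df[target]
    pipe = Pipeline([
        ("scale", StandardScaler()),                                # scale features
        ("select", SelectKBest(f_classif, k=min(5, X.shape[1]))),   # keep best features
        ("net", MLPClassifier(hidden_layer_sizes=(16, 8),           # heuristic layer sizes
                              activation="relu",
                              max_iter=300, random_state=0)),
    ])
    # cross-validate and return the mean score
    return cross_val_score(pipe, X, y, cv=3).mean()
```

In practice the layer sizes, activation, and loss would themselves be chosen from the data, which is the part this sketch hard-codes.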
BITCOIN PRICE PREDICTION USING MULTIPLE REGRESSION TECHNIQUES
An attempt to facilitate Cryptocurrency trading for traders
The goal of this project was to determine how well commonly used regression techniques perform in the context of bitcoin price prediction. We compared the following regression techniques:
Lasso, Ridge, Decision Tree, LightGBM, XGBoost, and SVM regressors.
For our analysis, we used minute-level bitcoin pricing data, originally scraped from the cryptocurrency exchange BitFinex. This dataset consisted of 2,638,113 observations, spanning from 4/01/2013 at 06:56:00 to 2/07/2020 at 18:22:00. To analyze the performance of the various regression models, we needed a scheme for training each one. While methods such as cross-validation and shuffled splitting are commonly used for this, they are not applicable to time series data: they assume that observations are independent of one another, which is false for financial time series.
Thus, we sought to compare both the effectiveness and the cost of offline regression models, trained on a fixed portion of past data, against online learning models, trained solely on a fixed number n of previous observations.
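To make the online scheme concrete, here is a minimal walk-forward sketch (not the project's code): a Ridge regressor refit at each step on only the last n observations, predicting one minute ahead. The single lagged-price feature is an illustrative assumption, not the project's actual feature set.

```python
# Illustrative walk-forward scheme: refit on the last n observations,
# predict one step ahead. A single lagged-price feature is used here
# purely for illustration, not the project's actual feature set.
import numpy as np
from sklearn.linear_model import Ridge

def rolling_window_forecast(prices: np.ndarray, n: int = 60) -> np.ndarray:
    preds = []
    for t in range(n, len(prices) - 1):
        window = prices[t - n:t + 1]           # the last n+1 observed prices
        X = window[:-1].reshape(-1, 1)         # price at minute i ...
        y = window[1:]                         # ... predicts price at minute i+1
        model = Ridge(alpha=1.0).fit(X, y)     # refit on this window only
        preds.append(model.predict(prices[t:t + 1].reshape(-1, 1))[0])
    return np.asarray(preds)                   # preds[k] forecasts prices[n+1+k]
```

An offline model, by contrast, is fit once on a fixed historical split and never updated; the online version pays a refit cost at every step but can adapt to regime changes.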
COREFERENCE RESOLUTION SYSTEM
A model that finds all the references belonging to a given coreference cluster across multiple documents
The goal of this project is to find, across multiple documents, all occurrences of words that belong to a given cluster head. Coreference resolution is still an active research area in NLP, so we tried multiple techniques to achieve this.
We implemented string matching for words and noun phrases, word-embedding similarity, spaCy's word semantics, and Hobbs' algorithm, building clusters for pronouns and noun phrases separately.
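As an illustration of the string-matching pass only (the embedding-similarity and Hobbs-algorithm passes are omitted), a toy grouping function might look like the following. The threshold value and the difflib-based similarity measure are assumptions made for this sketch, not the project's actual implementation.

```python
# Toy string-matching pass: collect mentions whose surface form closely
# matches the cluster head. The threshold and similarity measure are
# illustrative choices, not the project's actual ones.
from difflib import SequenceMatcher

def cluster_mentions(mentions, head, threshold=0.8):
    cluster = []
    for m in mentions:
        # fuzzy surface-form similarity between mention and cluster head
        ratio = SequenceMatcher(None, m.lower(), head.lower()).ratio()
        # accept close matches or mentions that contain the head outright
        if ratio >= threshold or head.lower() in m.lower():
            cluster.append(m)
    return cluster
```

A pass like this catches exact and substring matches ("Obama", "Mr. Obama") but misses nominal references ("the president"), which is why the project layered embedding similarity and Hobbs' algorithm on top.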
RANDOM PROJECTS USING RESERVOIR COMPUTING
A model that makes non-linear data linearly separable using a reservoir
This project is a niche ML system that makes non-linear data linearly separable in a higher dimension, following the principles of Reservoir Computing and using crosstalk between two current-carrying wires as the reservoir.
A reservoir computing system consists of a reservoir for mapping inputs into a high-dimensional space and a readout for pattern analysis from the high-dimensional states in the reservoir. The major advantage of reservoir computing compared to other recurrent neural networks is fast learning, resulting in low training costs.
Crosstalk is any phenomenon by which a signal transmitted on one circuit or channel of a transmission system creates an undesired effect in another circuit or channel. It is usually caused by unwanted capacitive, inductive, or conductive coupling from one circuit or channel to another. While this phenomenon is undesired in electronics, we use it as a 'physical reservoir' to project our data into a higher dimension.
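A hedged sketch of the reservoir principle itself, with a fixed random nonlinear map standing in for the physical crosstalk reservoir (recurrent dynamics are omitted for brevity): data that is not linearly separable in two dimensions, such as points inside versus outside a circle, becomes separable by a simple linear readout after projection into the reservoir's high-dimensional state.

```python
# Sketch of the reservoir principle: a fixed random nonlinear projection
# (standing in here for the physical crosstalk reservoir) followed by a
# trained linear readout. Recurrent dynamics are omitted for brevity.
import numpy as np

class RandomReservoir:
    def __init__(self, in_dim: int, res_dim: int = 200, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((in_dim, res_dim))  # fixed, never trained
        self.b = rng.standard_normal(res_dim)

    def transform(self, X: np.ndarray) -> np.ndarray:
        # nonlinear high-dimensional state; only a readout on top is trained
        return np.tanh(X @ self.W + self.b)
```

Only the linear readout on the transformed states (e.g. a logistic regression) is trained, which is why reservoir computing is so much cheaper to train than a full recurrent network.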