Curriculum Vitae

Education

M.S. (by research) in Computer Science

CVIT, IIIT Hyderabad (July’16 - Dec’17)

Thesis : Unconstrained Arabic & Urdu Text Recognition using Deep CNN-RNN Hybrid Networks
Advisor : Prof. C.V. Jawahar
Major : Computer Vision and Machine Learning

B.Tech. (honors) in Computer Science

IIIT Hyderabad (July’12 - April’16)

CGPA : 8.43 (out of 10)
Courses : ECE449 Artificial Neural Networks - CSE441 Database Systems - CSE565 Cloud Computing - CSE577 Machine Learning - CSE578 Computer Vision - CSE481 Optimization Methods - CSE478 Digital Image Processing - CSE471 S.M. in AI - CSE371 Artificial Intelligence - IEC239 Digital Signal Analysis - ICS211 Algorithms - IMA201 Calculus & Complex Numbers - and about 40 more.

Publications

Mohit Jain, Minesh Mathew and C.V. Jawahar, Unconstrained Scene Text and Video Text Recognition for Arabic Script, 1st International Workshop on Arabic Script Analysis and Recognition (ASAR 2017), Nancy, France, 2017. [BEST PAPER AWARD]

Mohit Jain, Minesh Mathew and C.V. Jawahar, Unconstrained OCR for Urdu using Deep CNN-RNN Hybrid Networks, 4th Asian Conference on Pattern Recognition (ACPR 2017), Nanjing, China, 2017. [STUDENT TRAVEL AWARD]

Minesh Mathew, Mohit Jain and C.V. Jawahar, Benchmarking Scene Text Recognition in Devanagari, Telugu and Malayalam, 6th International Workshop on Multilingual OCR (MOCR 2017), Kyoto, Japan, 2017.

Experience

Data Science Manager

Indeed.com, Singapore, Singapore. (Nov’22 - Present)

Data Science Manager (Engineering TDM) for the Core Ranking team under the Match Recommendation Platform organisation. We help people get jobs by providing the best possible Candidate-Job matches across various surfaces on Indeed, using our Recommendation Systems and Machine Learning expertise.

Manager, Data Science and Engineering

XPO Logistics, Inc., Hyderabad, India. (Oct’21 - Oct’22)

Led the 20-member India team comprising Data Scientists, Data Engineers, Cloud Engineers and Annotators, playing a key role in identifying and implementing multiple white-space opportunities while working closely with India Leadership to establish a new technology office in Hyderabad from the ground up.

Key Projects/Milestones

Data Science : Fraudulent Claims Prediction (XGBoost model predicting status for 15k claims/month worth $20M), Customer Geo-Fencing (DBSCAN-based solution for 600k locations), OPS Metrics Forecasting (ARIMA, Prophet).
Data Engineering : Google BigQuery Data Warehouse (2,000 Petabyte data scan/day across 2000+ Informatica workflows serving 72 Data Lake Source systems).

Senior Data Scientist

XPO Logistics, Inc., Pune, India. (Jul’19 - Oct’21)

XPO is a top-ten global provider of transportation and logistics services, with a highly integrated network of people, technology and physical assets. I’m part of the LTL (Less-Than-Truckload) division’s IT team and work on solving a large variety of industrial-scale problems using computer vision, optimization and data analysis.

Key Project : DR OCR Audit solution for automatically identifying accessorial charges using Computer Vision.

Single-handedly implemented the custom OpenCV pipeline module to automatically identify accessorial tick marks, which was deployed using Google Cloud Functions, PubSub and Vertex AI Airflow Pipelines.
Processes 80,000+ receipts every day, generating an additional $10M of annual revenue.

CV Research Engineer

Abzooba Infotech India Pvt. Ltd., Pune, India. (Jan’17 - Jul’19)

Abzooba Inc. is a US-based data analytics and big data organisation, rated as one of the “Top Analytics Company 2012” in the world. My role here involves creating production-scale systems from bleeding-edge Computer Vision and Deep Learning algorithms. My responsibilities also include leading the Computer Vision team, handling all Computer Vision related Pre-Sales requests and functioning as the point of contact for all business and client teams, on-shore and off-shore.

Lead Backend & ML Developer

StartupFlux, Noida, India. (Oct’16 - Apr’17)

StartupFlux provides smart business analytics for startups and investors using Deep Learning and Machine Learning techniques. My responsibilities include leading the Back-End and ML operations at this startup, mentoring and managing the team of developers and interns working here.

Web Administrator

IIIT Hyderabad, India. (April’15 - Present)

Job requires maintaining and sustaining various online portals and databases in use at IIIT Hyderabad.

Research Intern

Virginia Tech, Blacksburg, U.S.A. (April’15 - March’16)

Worked on making Convolutional Neural Networks robust against adversarial perturbations under the guidance of Prof. Dhruv Batra (VT) and Prof. C.V. Jawahar (IIIT-H).

Summer Of Code

CloudCV, Blacksburg, U.S.A. (April’15 - Dec’15)

Worked on creating cloud-based servers capable of carrying out computation-intensive machine learning tasks, with a very user-friendly GUI.

Undergraduate Teaching Assistant

IIIT Hyderabad, India.

Job requires teaching students of the respective courses via tutorial classes and grading coursework/exams.
TA for Computer Networks : Spring’15
TA for IT Workshop - I : Fall’14

Projects

Unconstrained Scene Text and Video Text Recognition for Arabic Script (Project Page)

People Involved : Mohit Jain, Minesh Mathew and C.V. Jawahar

Building robust recognizers for Arabic has always been challenging. We demonstrate the effectiveness of an end-to-end trainable CNN-RNN hybrid architecture in recognizing Arabic text in videos and natural scenes. We outperform the previous state of the art on two publicly available video text datasets - ALIF and AcTiV. For the scene text recognition task, we introduce a new Arabic scene text dataset and establish baseline results. For scripts like Arabic, a major challenge in developing robust recognizers is the lack of large quantities of annotated data. We overcome this by synthesizing millions of Arabic text images from a large vocabulary of Arabic words and phrases.

Unconstrained OCR for Urdu using Deep CNN-RNN Hybrid Networks (Project Page)

People Involved : Mohit Jain, Minesh Mathew and C.V. Jawahar

Building robust text recognition systems for languages with cursive scripts like Urdu has always been challenging. Intricacies of the script and the absence of ample annotated data further act as adversaries to this task. We demonstrate the effectiveness of an end-to-end trainable hybrid CNN-RNN architecture in recognizing Urdu text from printed documents, typically known as Urdu OCR. The proposed solution is not bound by any language-specific lexicon, with the model following a segmentation-free, sequence-to-sequence transcription approach. The network transcribes a sequence of convolutional features from an input image to a sequence of target labels.