M.S. (by research) in Computer Science
CVIT, IIIT Hyderabad (July’16 - Dec’17)
- Thesis : Unconstrained Arabic & Urdu Text Recognition using Deep CNN-RNN Hybrid Networks
- Advisor : Prof. C.V. Jawahar
- Major : Computer Vision and Machine Learning
B.Tech. (honors) in Computer Science
IIIT Hyderabad (July’12 - April’16)
- CGPA : 8.43 (out of 10)
- Courses : ECE449 Artificial Neural Networks - CSE441 Database Systems - CSE565 Cloud Computing - CSE577 Machine Learning - CSE578 Computer Vision - CSE481 Optimization Methods - CSE478 Digital Image Processing - CSE471 S.M. in AI - CSE371 Artificial Intelligence - IEC239 Digital Signal Analysis - ICS211 Algorithms - IMA201 Calculus & Complex Numbers - & about 40 more.
Mohit Jain, Minesh Mathew and C.V. Jawahar, Unconstrained Scene Text and Video Text Recognition for Arabic Script, 1st International Workshop on Arabic Script Analysis and Recognition (ASAR 2017), Nancy, France, 2017. [BEST PAPER AWARD]
Mohit Jain, Minesh Mathew and C.V. Jawahar, Unconstrained OCR for Urdu using Deep CNN-RNN Hybrid Networks, 4th Asian Conference on Pattern Recognition (ACPR 2017), Nanjing, China, 2017. [STUDENT TRAVEL AWARD]
Minesh Mathew, Mohit Jain and C.V. Jawahar, Benchmarking Scene Text Recognition in Devanagari, Telugu and Malayalam, 6th International Workshop on Multilingual OCR (MOCR 2017), Kyoto, Japan, 2017.
XPO Logistics, Inc., Pune, India. (Jul’19 - Present)
- Work as a Data Scientist with Computer Vision expertise for the Less-Than-Truckload division of XPO Logistics, Inc.
CV Research Engineer
Abzooba Infotech India Pvt. Ltd., Pune, India. (Jan’16 - Jul’19)
- Abzooba Inc. is a US-based data analytics and big data organisation, rated as one of the “Top Analytics Company 2012” in the world. My role here involves creating production-scale systems from bleeding-edge Computer Vision and Deep Learning algorithms. My responsibilities also include leading the Computer Vision team, handling all Computer Vision-related pre-sales requests, and functioning as the point of contact for all business and client teams, on-shore and off-shore.
Lead Backend & ML Developer
StartupFlux, Noida, India. (Oct’16 - Apr’17)
- StartupFlux provides smart business analytics for startups and investors using Deep Learning and Machine Learning techniques. My responsibilities include leading the Back-End and ML operations at this startup, mentoring and managing the team of developers and interns working here.
IIIT Hyderabad, India. (April’15 - Present)
- Job requires maintaining the various online portals and databases in use at IIIT Hyderabad.
Virginia Tech, Blacksburg, U.S.A. (April’15 - March’16)
- Worked on making Convolutional Neural Networks robust against adversarial perturbations under the guidance of Prof. Dhruv Batra (VT) and Prof. C.V. Jawahar (IIIT-H).
Summer Of Code
CloudCV, Blacksburg, U.S.A. (April’15 - Dec’15)
- Worked on creating cloud-based servers capable of carrying out computation-intensive machine learning tasks through a very user-friendly GUI.
Undergraduate Teaching Assistant
IIIT Hyderabad, India.
- Job requires teaching students of the respective courses via tutorial classes and grading coursework/exams.
- TA for Computer Networks : Spring’15
- TA for IT Workshop - I : Fall’14
Unconstrained Scene Text and Video Text Recognition for Arabic Script (Project Page)
- Building robust recognizers for Arabic has always been challenging. We demonstrate the effectiveness of an end-to-end trainable CNN-RNN hybrid architecture in recognizing Arabic text in videos and natural scenes. We outperform the previous state of the art on two publicly available video text datasets, ALIF and AcTiV. For the scene text recognition task, we introduce a new Arabic scene text dataset and establish baseline results. For scripts like Arabic, a major challenge in developing robust recognizers is the lack of a large quantity of annotated data. We overcome this by synthesizing millions of Arabic text images from a large vocabulary of Arabic words and phrases.
Unconstrained OCR for Urdu using Deep CNN-RNN Hybrid Networks (Project Page)
- Building robust text recognition systems for languages with cursive scripts like Urdu has always been challenging. Intricacies of the script and the absence of ample annotated data further act as adversaries to this task. We demonstrate the effectiveness of an end-to-end trainable hybrid CNN-RNN architecture in recognizing Urdu text from printed documents, typically known as Urdu OCR. The proposed solution is not bound by any language-specific lexicon, with the model following a segmentation-free, sequence-to-sequence transcription approach. The network transcribes a sequence of convolutional features from an input image to a sequence of target labels.