- Expertise in handling large-scale structured and unstructured data
- Experience efficiently handling large-scale generative AI datasets and outputs
- Familiarity with Docker tools and pipenv/conda/poetry environments
- Comfortable following Python project management best practices (setup.py, logging, pytest, relative module imports, Sphinx docs, etc.)
- Familiarity with GitHub (clone, fetch, pull/push, raising issues and PRs, etc.)
- Strong familiarity with DL theory and practice in NLP applications
- Comfortable coding with Hugging Face, LangChain, Chainlit, TensorFlow and/or PyTorch, scikit-learn, NumPy, and pandas
- Comfortable using two or more open-source NLP modules such as spaCy, TorchText, fastai.text, farm-haystack, and others
- Knowledge of fundamental text data processing (regex, token/word analysis, spelling correction and noise reduction in text, segmenting noisy or unfamiliar sentences/phrases at the right places, deriving insights from clustering, etc.)
- Have implemented real-world BERT or other fine-tuned transformer models (sequence classification, NER, or QA) end to end, from data preparation and model creation through inference to deployment
- Use of GCP services such as BigQuery, Cloud Functions, Cloud Run, Cloud Build, and Vertex AI
- Good working knowledge of other open-source packages for benchmarking and deriving summaries
- Experience using GPUs/CPUs on cloud and on-prem infrastructure
- Design NLP/LLM/GenAI applications/products by following robust coding practices
- Explore SoTA models/techniques so that they can be applied to automotive industry use cases
- Conduct ML experiments to train models and run inference; where needed, build models that meet memory and latency restrictions
- Deploy REST APIs or a minimalistic UI for NLP applications using Docker and Kubernetes tools
- Showcase NLP/LLM/GenAI applications to users as effectively as possible through web frameworks (Dash, Plotly, Streamlit, etc.)
- Converge multiple bots into super apps using multimodal LLMs
- Develop agentic workflows using AutoGen, Agentbuilder, and LangGraph
- Build modular AI/ML products that could be consumed at scale
Education: Bachelor’s degree in Engineering or Master’s degree in Computer Science, Engineering, Maths, or Science
Completion of modern NLP/LLM courses or participation in open competitions is also welcome.