Talent500

Gen AI Data Scientist [T500-17660]

Job Location

chennai, India

Job Description

Talent500 is hiring for one of our client Short Description: As a Software Engineer, you will work on developing features for the Mach1ML platform, supporting customers in model deployment using Mach1ML platform on GCP and On-prem. You will follow Rally to manage your work. You will incorporate an understanding of product functionality and customer perspective for model deployment. You will work on cutting-edge technologies such as GCP, Kubernetes, Docker, Seldon, Tekton, Airflow, Rally, etc. Description for Internal Candidates Experience in SonarQube, CICD, Tekton, terraform, GCS, GCP Looker, Google cloud build, cloud run, Vertex AI, Airflow, TensorFlow, etc., Experience in Train, Build and Deploy ML, DL Models Experience in HuggingFace, Chainlit, React Ability to understand technical, functional, non-functional, security aspects of business requirements and delivering them end-to-end. Ability to adapt quickly with open source products & tools to integrate with ML Platforms Building and deploying Models (Scikit learn, DataRobots, TensorFlow PyTorch, etc.) Developing and deploying On-Prem & Cloud environments Kubernetes, Tekton, OpenShift, Terraform, Vertex AI Experience in LLM models like PaLM, GPT4, Mistral (open-source models), Work through the complete lifecycle of Gen AI model development, from training and testing to deployment and performance monitoring. Developing and maintaining AI pipelines with multimodalities like text, image, audio etc. Have implemented real-world Chat bots or conversational agents at scale handling different data sources. Experience in developing Image generation/translation tools using any of the latent diffusion models like stable diffusion, Instruct pix2pix. Expertise in handling large scale structured and unstructured data. Efficiently handled large-scale generative AI datasets and outputs. Familiarity in the use of Docker tools, pipenv / conda / poetry env Comfort level in following Python project management best practices (use of setup.py, logging, pytests, relative module imports,sphinx docs,etc.,) Familiarity in use of Github (clone, fetch, pull/push,raising issues and PR, etc.,) High familiarity in the use of DL theory / practices in NLP applications Comfort level to code in Huggingface, LangChain, Chainlit, Tensorflow and/or Pytorch, Scikit-learn, Numpy and Pandas Comfort level to use two/more of open source NLP modules like SpaCy, TorchText, fastai.text, farm-haystack, and others Knowledge in fundamental text data processing (like use of regex, token/word analysis, spelling correction/noise reduction in text, segmenting noisy unfamiliar sentences/phrases at right places, deriving insights from clustering, etc.,) Have implemented in real-world BERT/or other transformer fine-tuned models (Seq classification, NER or QA) from data preparation, model creation and inference till deployment Use of GCP services like BigQuery, Cloud function, Cloud run, Cloud Build, VertexAI, Good working knowledge on other open source packages to benchmark and derive summary Experience in using GPU/CPU of cloud and on-prem infrastructures Skillset to leverage cloud platform for Data Engineering, Big Data and ML needs. Use of Dockers (experience in experimental docker features, docker-compose, etc.,) Familiarity with orchestration tools such as airflow, Kubeflow Experience in CI/CD, infrastructure as code tools like terraform etc. Kubernetes or any other containerization tool with experience in Helm, Argoworkflow, etc., Ability to develop APIs with compliance, ethical, secure and safe AI tools. Good UI skills to visualize and build better applications using Gradio, Dash, Streamlit, React, Django, etc., Deeper understanding of javascript, css, angular, html, etc., is a plus. Responsibilities for Internal Candidates Design NLP / LLM / GenAI applications/products by following robust coding practices, Explore SoTA models/techniques so that they can be applied for automotive industry use cases Conduct ML experiments to train/infer models; if need be, build models that abide by memory & latency restrictions, Deploy REST APIs or a minimalistic UI for NLP applications using Docker and Kubernetes tools Showcase NLP / LLM / GenAI applications in the best way possible to users through web frameworks (Dash, Plotly, Streamlit, etc.,) Converge multibots into super apps using LLMs with multimodalities Develop agentic workflow using Autogen, Agentbuilder, langgraph Build modular AI/ML products that could be consumed at scale. Qualifications for Internal Candidates Education: Bachelor’s or Master’s Degree in Computer Science, Engineering, Maths or Science Performed any modern NLP / LLM courses/open competitions is also welcomed.

Location: chennai, IN

Posted Date: 4/30/2025

View More Talent500 Jobs

Contact Information

Contact	Human Resources Talent500