Machine Learning Applications Engineer

Research & Development

United States - NY, New York

Medidata: Powering Smarter Treatments and Healthier People

Medidata, a Dassault Systèmes company, is leading the digital transformation of life sciences, creating hope for millions of people. Medidata helps generate the evidence and insights that pharmaceutical, biotech, medical device and diagnostics companies, and academic researchers need to accelerate value, minimize risk, and optimize outcomes. More than one million registered users across 2,000+ customers and partners access the world's most trusted platform for clinical development, commercial, and real-world data. Known for its ground-breaking technological innovations, Medidata has supported more than 30,000 clinical trials and 9 million study participants. Medidata’s ongoing commitment to infusing the patient voice into trial designs and solutions is also helping to create a better and more inclusive experience for all participants in clinical studies. Medidata is involved in nearly 40% of company-initiated trial starts globally, with studies conducted in more than 140 countries. More than 70% of novel drugs approved by the Food and Drug Administration (FDA) in 2022 were developed with Medidata software. Medidata is headquartered in New York City and has offices around the world to meet the needs of its customers. Follow us @medidata.

Your Mission:

Medidata is the leader in developing the technologies that allow our customers to get medicines to patients faster. Building on our long history of delivering world-class clinical applications to the life sciences industry, Medidata AI is staffed with a passionate team of technology and scientific experts, tackling the industry’s most difficult technical challenges in order to push the boundaries of possibility for our clients and, most importantly, for patients. Together we can deliver meaningful, advanced digital transformation to the industry in order to achieve our vision.

Your Responsibilities: 

Medidata is looking for an experienced Machine Learning (ML) Engineer to help design, implement, deploy, and productionize our revolutionary data science products as part of our Medidata Platform and AI organization. 

  • Solve some of the most complex problems in healthcare, translating complicated multimodal data into meaningful insights.

  • Leverage our software engineering partners alongside our internal teams to operate and deliver in the most effective manner.

  • Build ML pipelines to support experimentation, Continuous Integration and Continuous Delivery (CI/CD), containerization, verification, validation, deployment, and monitoring of ML models.

  • Advise and collaborate closely with the architecture team, infrastructure team, data scientists, data engineers, and software engineers to operationalize ML assets, such as establishing best practices and templates to support full-lifecycle data science development and deployment.

  • Apply agile methodology to deliver high-quality code with continuous integration.

Your Competencies: 

  • Strong verbal and written communication; collaborative focus; troubleshooting and problem-solving skills

  • Solid knowledge of Linux, shell scripting, and cloud computing

  • Demonstrated understanding and working knowledge of AWS Cloud technologies, with preference for the following AWS certifications: Solutions Architect, Developer, DevOps Engineer, Data Analytics, and/or Machine Learning

  • Understanding of common classical machine learning models and techniques (Classification, Regression, Clustering, Time-Series Analysis, etc.) to solve real world problems.

  • Familiarity with Natural Language Processing (NLP) techniques and tools, such as topic modeling, word/document embeddings, spaCy, etc. is preferred

  • Familiarity with Large Language Model (LLM) frameworks is preferred

Your Education & Experience:

  • Bachelor of Science required; Master’s degree in a relevant field preferred

  • 5 years of relevant work experience

  • Experience working with remote engineering teams and 3rd party contractors

  • Experience with GitHub, Git, Docker, and working with Relational Database Management Systems such as MySQL

  • Experience in statistical tools and programming languages (Python, SQL, R) that allow you to analyze complex clinical data

  • Experience with common ML and Deep Learning frameworks (Scikit-Learn, Pandas, TensorFlow, PyTorch)

  • Experience developing REST APIs in Python is preferred

  • Experience in distributed cloud-computing platforms and technologies like Kubernetes, Hadoop, and Spark ecosystems is preferred

  • Experience with Infrastructure as Code, Terraform, AutoML, and GraphQL is preferred

The salary range posted below refers only to positions that will be physically based in New York City. As with all roles, Medidata sets ranges based on a number of factors including function, level, candidate expertise and experience, and geographic location. Pay ranges for candidates in locations other than New York City may differ based on the local market data in that region. The base salary pay range for this position is $135,000 to $180,000.

Base pay is one part of the Total Rewards that Medidata provides to compensate and recognize employees for their work. Most sales positions are eligible for a commission on the terms of applicable plan documents, and many of Medidata’s non-sales positions are eligible for annual bonuses. Medidata believes that benefits should connect you to the support you need when it matters most and provides best-in-class benefits, including medical, dental, life and disability insurance; 401(k) matching; unlimited paid time off; and 10 paid holidays per year.

Equal Employment Opportunity

In order to provide equal employment and advancement opportunities to all individuals, employment decisions at Medidata are based on merit, qualifications and abilities. Medidata is committed to a policy of non-discrimination and equal opportunity for all employees and qualified applicants without regard to race, color, religion, gender, sex (including pregnancy, childbirth or medical or common conditions related to pregnancy or childbirth), sexual orientation, gender identity, gender expression, marital status, familial status, national origin, ancestry, age, disability, veteran status, military service, application for military service, genetic information, receipt of free medical care, or any other characteristic protected under applicable law. Medidata will make reasonable accommodations for qualified individuals with known disabilities, in accordance with applicable law.