Data Science Engineer

Software Development

United States - NY, New York

Medidata: Power Smarter Treatments and Healthier People

Medidata is leading the digital transformation of life sciences, creating hope for millions of patients. Medidata generates the evidence and insights that help pharmaceutical, biotech, medical device and diagnostics companies, and academic researchers accelerate value, minimize risk, and optimize outcomes. More than one million registered users across 1,900+ customers and partners access the world's most trusted platform for clinical development, commercial, and real-world data. Medidata, a Dassault Systèmes company, is headquartered in New York City and has offices around the world to meet the needs of its customers. Discover more and follow us @medidata.


Our Team: Our team is responsible for Medidata’s data platform. Our customers depend on our ability to process and analyze large quantities of data in real time. We focus on solving complex real-world problems while shipping practical solutions. What we do is as much art as it is science, and as Steve Jobs put it, “Real artists ship.”

Who we’re looking for:

Someone who can work closely with data scientists, data engineers, and product leadership to design schemas and build database objects consumed by Medidata’s data products. This involves developing end-to-end pipelines for data ingestion and curation. We value delivering solutions that serve immediate needs while allowing us to iterate toward the ideal state in a complex, changing data landscape.

  • Solve some of the most complex problems in healthcare, translating complex data into meaningful insights
  • Design, develop, and validate statistical models for novel medical applications. Areas of team focus include clinical trial analytics
  • Evaluate and assess novel tools, algorithms, and technologies that enable data science capabilities
  • Provide support functions around model-building, including data cleaning and code review
  • Understand and peer-review complex, multivariable statistical models and data analytics solutions that use machine learning algorithms
  • Work independently on complex and diverse issues and propose intelligent solutions
  • Bring developed methods and code to production for integration with existing and new products


Requirements (Education & Experience):

  • B.S., M.S., or Ph.D. in Computer Science, Physics, Engineering, or another quantitative field with a strong foundation in computer science
  • 1-3 years of relevant experience
  • Experience in query design on large, unstandardized, complex datasets (SQL, SPARQL, etc.)
  • Experience with both relational and non-relational databases
  • Preferred experience with large healthcare datasets