Infobahn Softworld Inc

Cluster Engineer GPU+CPU @ REMOTE

Best Contact Number:

We have an immediate opportunity with one of our direct clients. Please find the job description below. If you are interested, please forward your resume and share the following details:

Work Authorization:

Expected Hourly Pay Rate (W2):

Month & Day of Birth (MM/DD):

Present Location & ZIP Code:

LinkedIn:

Job Title: Cluster Engineer

Location: Hillsboro, OR

Duration: 6-Month Contract

100% remote

Laptop provided

High School Diploma / GED is required

Our client is looking for an experienced cluster administrator to manage the Client Labs academic research cluster, which has 200+ compute nodes. The right candidate will have experience with popular HPC technologies, be able to work independently for extended periods, and be eager to learn quickly by solving complex problems.

Responsibilities

  • Serve as the cluster administrator for the Intel Labs academic research cluster.
  • Drive deployments of new servers of varying types as needed. This will likely involve engaging with product teams to apply best-known methods and configurations.
  • Serve as the technical escalation point for first-level support technicians, likely on topics such as job submissions, resource requests, and troubleshooting.
  • Monitor and report utilization of cluster resources, including compute nodes and NAS storage, using existing tools such as Zabbix and potentially Grafana + Prometheus.
  • Serve as the owner of the SLURM job scheduler, defining the configuration that best fits the user base and developing/enabling advanced features as applicable.
  • Maintain the software stack, including Intel oneAPI, OpenHPC, and one-off package install requests, across compute nodes using the existing Chef configuration management tool.
  • Educate users with varying levels of experience on how to best use the cluster when they reach out with questions. Regularly update the user documentation with the most relevant details to improve their experience.

Desired Skills

  • Good communication skills. You can effectively communicate with a variety of stakeholders, including presenting plans to higher management and having technical discussions with engineers/scientists.
  • Experience designing and managing large clusters with heterogeneous hardware (CPUs, GPUs, etc.).
  • User-centric and results-oriented. You can learn from data what the needs of our scientists/engineers will be and can produce a cluster growth plan to fulfill those needs.
  • Power user. You are willing to extensively test the different workflows that run in the cluster and help optimize them.
  • Cluster tech stack. You are an expert in cluster orchestration and management and are familiar with technologies such as SLURM, Docker, Zabbix, and Chef (or you are willing to learn them quickly).

Seniority level: Entry level

Employment type: Contract

Job function: Engineering and Information Technology

Industries: IT Services and IT Consulting
