Gremlin

Software Development

San Jose, California 11,124 followers

The Reliability Management Platform for high-velocity engineering teams

About us

Gremlin’s Reliability Management Platform enables high-velocity engineering teams to standardize and automate reliability across their organizations without slowing down software delivery. Gremlin's Reliability Score sets the standard for reliability so there's no guesswork, and an automated suite of Reliability Management tools makes it easy to integrate reliability throughout the software lifecycle so there's no slowdown.

Website: http://www.gremlin.com
Industry: Software Development
Company size: 51-200 employees
Headquarters: San Jose, California
Type: Privately held
Founded: 2016
Specialties: Distributed Systems, Resilience, Failures as a Service, DevOps, and Chaos Engineering

Updates

  • Thrilled to share more about how Upwork uses Gremlin to improve its stability and capabilities in a complex ecosystem! 🚀

    "Gremlin's Chaos Engineering tools can safely and securely inject failure into systems to find weaknesses before they cause customer-facing issues. This approach is useful to experiment with specific failure patterns across infrastructure." -- Angel Boscan

    Read more at the link in the comments.

  • Are you tracking reliability with leading or lagging indicators?

    "These are our leading indicators, right? Services that have good coverage are gonna be less likely to fail because we know many of the failure modes that they might experience, they're resilient against. And this is gonna show us where you are. So where previously we only had these trailing indicators, those lagging indicators, which was like, 'Oh, here was the reliability of a service: we experienced these outages last year.' Now we can start to have leading indicators showing here's what we're doing to make our services more resilient over time." -- Samuel Rossoff, Gremlin Principal Engineer

  • Are you looking at reliability as a one-time, short-term effort, or is this part of your core programs and efforts?

    "Building reliability is not a one-time sprint. It's not something you can run like a project. You really can't assign a project manager. Whoever takes up this role or takes up the mantle for operationalizing the program, this is a regular role that they will fill until the next person starts filling it. Funding it like a project, you're gonna have a bad time. You're just going to need to fund it again. It should be part of your keep the lights on or engineering excellence budgets, just account for it and hire for it, and expect that reporting call." -- Jeff Nickoloff, Gremlin Principal Engineer

  • Reliability efforts do take up some bandwidth, but in the end it's worth it, as our customers find out when their outage costs go down.

    "Everyone has their own priorities that they're dealing with. Given unlimited time and money, absolutely everyone would want to build the best possible system that is the most secure, performant, resilient, and everything. So yes, there's always that back and forth of the amount of bandwidth teams have available to take on resiliency, but if you have such a setup where you have enablement teams that can do some of the common tasks that make it easier for teams to adopt, that has worked well in our case. You have to make the value of this known and get buy-in from an organization level because the investment that you're going to make here, it's not going to be a leading indicator. The indicator is going to lag a little bit, but it will show in the results of your outage costs going down." -- Kaushal Dalvi, Sr. Principal Engineer, UKG

  • Are you testing for known reliability vulnerabilities?

    "Risks have different priorities, but ultimately we want to be aware of those risks. Just like we want our security team to go scan for known vulnerabilities, our reliability team should be scanning for known vulnerabilities. And those are easy things we should go address. There's a second part of it, which is kind of just good engineering testing, which is: Hey, we have a set of test cases that we know need to pass. What happens if we lose a dependency? What happens if we lose an availability zone? What happens if we shift over a region? Those are important test cases. Are we testing them? Do we have tests that cover them? Are we running them on a regular basis?" -- Kolton Andrus, Gremlin CTO

  • Gremlin reposted this

    Josh Leslie

    CEO of Gremlin | Making applications more reliable | GTM-focused investor & advisor

    I’m in New York this week. I didn’t fly out for a work meeting, although I’ll have a few local meetings while here. I came here to see friends and family and spend time in one of my favorite places. My work schedule doesn’t change (except for time shifting); I’ve just relocated here for the week. 🏙

    At Gremlin, people take vacation. We have unlimited vacation, we encourage people to take the time they need, and our company executives lead by example. But we also encourage our people to take “relocations,” and they do. Gremlin employees have worked from Mexico City, Tokyo, Portugal, an RV on the California coast, and Maui, among other places.

    We don’t care where people work. We (mostly) don’t care when people work. We care about impact. #remotework

    P.S. This week was a painful reminder not to travel to NYC in August. It’s hot *and* it’s rainy?!?!

Funding

Gremlin: 3 total rounds

Last round

Series B

US$ 18.0M

Investors

Redpoint
See more information on Crunchbase