I recently had the privilege of attending SREcon23 Americas. This conference is a gathering of Site Reliability Engineering (SRE) professionals who share their experiences with implementing and improving SRE services. USENIX, The Advanced Computing Systems Association, organizes this annual conference. SREcon 2023 was held in beautiful Santa Clara, CA, and attendees came from all over the globe.
SREcon23 Americas Sessions
The opening sessions included “The Endgame of SRE” from Amy Tobey, Equinix, and “SRE’s Critical Role in the COVID-19 Pandemic Response in Government” from Amy Quispe, U.S. Digital Service, Alum; Marc Alvidrez, U.S. Digital Service; Rick Hawes, U.S. Digital Service, CDC. Both sessions highlighted the critical role that SRE plays within an organization.
There were many other sessions throughout the three days of the conference, with speakers from organizations heavily involved in SRE practices, including J.P. Morgan Chase, LinkedIn, Morgan Stanley, Netflix, Bloomberg, DBS Bank, and Spotify.
Some of my favorite sessions were:
- Confessions of an SRE Manager
- Exploring Disconnects between Reliability Practitioners and Management/Executives
- Implementing SRE in a Regulated Environment
- Financial Resiliency Engineering: Taming Cloud Costs
What SREcon is All About
First and foremost, SREcon is about folks who are passionate about SRE and want to learn and share their experiences.
There is significant research within the Site Reliability Engineering field, and I got to meet researchers who regularly publish papers on the subject. SREcon also highlights research grants and grant recipients.
Important Takeaways from SREcon23 Americas
Site Reliability Engineering is not platform engineering, observability, or DevOps. SRE is its own discipline, even though many people use these terms interchangeably.
There are many observability tools out there, and many of the vendors exhibited or presented at SREcon23 Americas, including Datadog, Splunk, Observe, New Relic, PagerDuty, Dataset, Dynatrace, and Grafana. To find the right fit, you have to dig into which features you actually need from an observability tool.
The cloud has changed how SRE teams work. When you adopt managed and PaaS services, service-level reliability is often built and offered by the cloud provider, so reliability becomes a shared responsibility. You take advantage of the out-of-the-box reliability features that the cloud provider offers, and then build your own reliability layer for the components and scenarios that the provider does not cover. For example, in an Availability Zone failure scenario, the customer is responsible for building their own reliability layer.
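To make the Availability Zone example a little more concrete, here is a minimal Python sketch of the arithmetic behind spreading a service across zones. It assumes zone failures are independent and that the service stays up as long as at least one zone is healthy. Both are simplifications, and the function names and the 99% figure are illustrative, not taken from any provider's SLA.

```python
def combined_availability(zone_availability: float, zones: int) -> float:
    """Availability of a service that survives while at least one zone is up.

    Assumes zone failures are independent, which is an idealization:
    real zones can share failure causes (region-wide outages, bad deploys).
    """
    if not 0.0 <= zone_availability <= 1.0:
        raise ValueError("zone_availability must be between 0 and 1")
    if zones < 1:
        raise ValueError("zones must be at least 1")
    # The service is down only if every zone is down at the same time.
    return 1.0 - (1.0 - zone_availability) ** zones


def downtime_minutes_per_year(availability: float) -> float:
    """Expected downtime in minutes over a 365-day year."""
    return (1.0 - availability) * 365 * 24 * 60


if __name__ == "__main__":
    # Hypothetical per-zone availability of 99%.
    for zones in (1, 2, 3):
        a = combined_availability(0.99, zones)
        print(f"{zones} zone(s): {a:.6f} availability, "
              f"{downtime_minutes_per_year(a):.1f} min/year of downtime")
```

The point of the sketch is only that the redundancy you add on top of the provider's guarantees is what changes the outcome in a zone failure. Real numbers depend on correlated failures, failover time, and how your traffic is routed.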
Even though we were a large group, we communicated very effectively on Slack. Some folks launched new virtual groups through Slack and later met in person during the evenings for working and brainstorming sessions.
Over the three days of SREcon, I had many interactions and passionately talked about SRE with other attendees. We came together as a community, and I look forward to talking more with the new friends I met at SREcon23 Americas.
Interested in learning more about site reliability engineering and how SRE services can help you achieve your organization’s reliability goals?
Contact us to continue the conversation around SRE.