Death of the DBA… Again…

If you’ve followed the blog for a while you will know I’ve revisited this topic several times since I first wrote about it in 2007. Here are some links if you want to go down memory lane.

I think I’ve been pretty consistent with my opinion on this subject, but I still get people misunderstanding the argument, so I’m going to try again, without revisiting the contents of all those previous articles.

What is a DBA?

This means something different in each company, and to each person I speak to.

In some companies it can mean a basic operations job. Install, patch, check backups and run scripts that other people send to you. All of these tasks are easily automated on-prem, and essentially don’t exist on the cloud. If this is the role you do, and this is all you do, you are going to have a bad time unless you gain some new skills. In other companies it can mean something completely different. My official title is “Lead DBA”. What do I do? Just off the top of my head:

  • DBA work with Oracle, SQL Server, MySQL.
  • Administration of middle tier stuff like WebLogic, Tomcat, NGINX etc.
  • I look after the load balancer configuration for a big chunk of the back-end business systems, including writing iRules in TCL.
  • Support for a few proprietary 3rd party apps.
  • Docker/containers.
  • APEX and ORDS. I’m the world’s worst APEX developer (see here), but I have to look after it, and support the APEX developers.
  • I don’t do much traditional development in this company, but I provide support when people have SQL and PL/SQL questions, because I’ve done that for a minute. πŸ™‚
  • I’m increasingly being drawn into automation using shell scripts, Ansible, Terraform, Liquibase etc.

I’ve already got rid of some of the operational aspects of my job. Over time I’m hoping more will go, but that mostly depends on external constraints holding me back. Even if my involvement with databases stopped completely, I could still remain busy. Am I saying my role is what a DBA should be? No. I think my position is a little more extreme than some. I’m saying DBA is a title that various people and companies use, but it doesn’t really mean anything now. It can be anything from basic operations to “Do Bloody Anything”.

When is it going to happen?

For some companies it already has. Some companies will hang on to the old ways until the bitter end. It won’t happen overnight, but if it is not happening already in your company, what you are likely to see is a gradual drop in operational tasks as they get automated. This allows either the same number of people to cover more systems, or fewer people to do all the current work.

If you are seeing pieces of your role disappearing, you have to do something else to add value!

But person X disagrees with you!

That’s fine. This is only my opinion and you don’t have to agree, but please check the context of what people are saying. Often the responses to my comments include mentions of performance tuning and diagnosing architectural issues. I have consistently said the “operations DBA” role would go. The job that focuses on the basic grunt work. There will be a continued need for people who can diagnose and solve performance problems, both in the databases and in the application architecture. Moving to the cloud won’t magically cure all performance issues, and some would say they will increase the number of issues. You can still deliver architecturally crap solutions on the cloud. You can still do bad database design and write bad applications. Good people will always find a home, provided they don’t stick rigidly to the past.

You also have to look at the person making the comments. If someone is a performance consultant to the stars, but identifies as a DBA, they are probably going to take these comments personally and hear me saying they are redundant, which I am not. If someone runs a DBA-as-a-service company, they won’t like the sound of this and will go into defensive mode.

I’ve been a regular DBA for most of my working life, and I’ve watched the job evolve. You may have a different experience, and that is fine, but I speak to a lot of people, and I think my opinion on this subject tracks pretty well with what is happening out there.

You are really talking about the cloud aren’t you?

No. Automation is the thing that is removing the need for the operational DBA and basic system admin roles. Even if your company is in cloud denial, it doesn’t mean they won’t want to take advantage of automation. The cloud makes some things easier from an automation perspective, because the cloud providers have done some of the leg-work for you, but automation existed long before the cloud…

What should I do?

When you know, please let me know so I can do it too. πŸ™‚ Seriously though, keep your mind open to new opportunities, and if you get a chance to try something new, give it a go. Nothing is ever wasted. Some people will gravitate to the data and analytics side of things. Some to development. Some to the architectural end of things. In all cases, there is a lot to learn and the less you know when you start, the harder the journey will be, so get learning whatever you like the look of…

Cheers

Tim…

Update 1: In social media comments, some people have mentioned the term “Data Engineer”. To me, a data engineer needs to understand data, and requires an understanding of the development process. I’ve met some operational DBAs that can barely string together a SQL statement. These types of operational DBAs have a lot to learn before they can consider themselves a data engineer. A DBA can become a data engineer, but being a DBA does not mean you are a data engineer.

Update 2: Don’t get distracted by the name DBA. I don’t care about the name of the role. I’m talking about the function the person is performing. If they are performing basic operational tasks, whatever they are called, they need to start getting some new skills.

AZ-900 Microsoft Azure Fundamentals (my thoughts)

I recently had to sit the AZ-900 Microsoft Azure Fundamentals exam for work. Here are some thoughts about it.

Preparation

There are lots of ways to prepare for this exam. Here are the things I used…

  • Microsoft Certified: Azure Fundamentals – Learn : My initial plan was to just work through these notes and sit the exam. After I had completed two of the six modules I was going crazy. I found the material really dull. Even though I switched to using video tutorials, I still came back to this material, and used the knowledge check questions. There are also some things in this material that were not covered elsewhere.
  • AZ-900 | Microsoft Azure Fundamentals Full Course, Free Practice Tests, Website and Study Guides : One of my colleagues suggested this YouTube playlist. It made the process of covering the material a lot easier. The guy who put this together included linked materials, which was handy. I used the free practice questions he provided to check my knowledge. This playlist was my main preparation.
  • Microsoft Azure Fundamentals Certification Course (AZ-900) – Pass the exam in 3 hours! : Someone on Twitter recommended this video. I only watched about 20 minutes of it, but it seemed OK. A number of people said it was good. You might want to give it a go.
  • Microsoft Virtual Training Days : This is a Microsoft-provided four-hour training session, spread over two days. It’s free, and if you attend it you get an exam voucher for free. They claim it is all you need to pass the exam, but I don’t think that is true. I did all my prep using the YouTube playlist mentioned previously, and attended this course as a revision session before sitting the exam. It is pre-recorded material, but there are people available to answer questions during the course. There are 2-3 sessions per month, so it’s easy to register for.

Remember, it’s a fundamentals exam, so it doesn’t go into any great depth, and it doesn’t qualify you for anything, but it visits a wide variety of features available in Azure. Despite the lack of technical depth, it takes quite a lot of time to prepare for the exam.

The Exam

My company uses Azure, and we are encouraged to take this cert. As mentioned above, if you attend the free Microsoft Virtual Training Days for the “Fundamentals” training, you get an exam voucher that will allow you to sit the exam for free.

The exam is multiple choice. There are traditional multiple choice questions, drag and drop questions, and sentence completion questions. There are between 40 and 60 questions and you need to score at least 700 out of 1000 points to pass. Some questions are simple True/False answers. Some are more complex, with multiple sub-questions, and are worth more points. The number of questions varies depending on how many of the complex multi-point questions you get. I read someone saying they had 30 questions, but the official site says 40-60. I had 37 questions in 45 minutes. πŸ™‚

There are questions that are not covered in any of the training materials I used, but you can hopefully make an educated guess based on what you do know. I believe the Microsoft Learn, Virtual Training Days and exam are due to be updated in April 2022. We were told the exam would track what is in the current training materials. That was not true. There were questions on subjects not in the current Microsoft Learn training materials, or the Virtual Training Days. Sneaky.

I’m sure some people will attend the Virtual Training Days course and pass the exam if they have great memory retention and get lucky with the questions that come up. For most people, I think just attending the training days course would result in a failure in the exam. I would not have passed with just the Virtual Training Days course.

BTW: During the Virtual Training Days course someone asked what happens if you fail the exam. The tutor said you can get another free exam voucher if you attend the training course again. Free is a good deal. πŸ™‚

Thoughts

I had a bit of a rant on Twitter while I was preparing for the exam, because I found the content really dull. I had a look through the equivalent exams for AWS and Oracle Cloud, and I think I would have a similar experience with them too. The problem, as I see it, is that cloud providers support a huge number of services, and most of them are of little interest to me. Despite that, you have to visit loads of them for the fundamentals/foundations courses. The result is:

  • For things you are interested in, you don’t get any useful information.
  • You have to waste time learning stuff you are not interested in.

But it’s a fundamentals exam, stupid! Yeah I know, but it doesn’t stop it being annoying.

In my company I will never get to define identity, governance, networking, pricing etc. At best I will be given a resource group and be told to work in there. As a result I couldn’t help but feel most of this was pointless.

As I said on Twitter,

“The vast majority of the cloud is amazingly boring, drab and shite! The fun thing is building solutions with it, not the boring fuckin’ paper work…”

Me, 2022

So I’m not a fan of these types of cloud certifications. I don’t want to do any more cloud certifications, but that’s just my opinion. I guess as long as they are free, you can decide for yourself. There is no way I would pay for this…

Cheers

Tim…

PS. I passed it.

PPS. I know I’m a grumpy old git, but it wasn’t my choice to do this certification, so that kind-of coloured my whole view of it.

PPPS. I’ve suggested to other people they should attend the Virtual Training Days course, and then decide if they want to revise for the exam. I think the course gives you a good idea of what you can do on Azure. Whether you want to spend the time to prepare for the exam is another matter.

How are you provisioning your databases on-prem and in the cloud? Poll results discussed.

Following on from my previous post, I wanted to discuss the results of the polls regarding database provisioning.

This was the first question I asked.

How are you provisioning your databases on-prem and in the cloud?

A couple of years ago I stopped putting GUI installation articles on my website. They look pretty and seem to get a lot of views, but I thought posting them was wrong because I never use GUI installations. Posting them felt like I was sending the wrong message. I wrote about that here. This was one of the reasons I led with this question. I was pleased with the results of the poll.

  • Using GUI: I understand some people don’t want to take a step backwards to move forwards, but at nearly 20%, this number is still too high IMHO. You can’t check button clicks into version control. I’m sure some smart arse will tell me you can if you use Robotic Process Automation (RPA) to click them. πŸ™‚
  • Shell scripts: If I’m honest I thought 34% was on the low side. I was expecting more people to be doing silent installations using shell scripts, but the number is lower because of the next option. If people have made a big investment into writing robust shell scripts with good error handling, I can understand the reluctance to move away from them. Ansible and Terraform are nice, but they are not magic. πŸ™‚
  • Ansible/Terraform/Other: This was actually the surprise of the bunch. I wasn’t expecting this number to be so high, but I was pleasantly surprised. The previous post showed lots of people running their databases in the cloud, which has no doubt helped to drive the uptake of automation tools like Ansible and Terraform. Happy days!

Spurred on by a question from Jasmin Fluri, I asked the following question to drill down a little more.

For people using Ansible and/or Terraform, how automated is your process?

This was also a pleasant surprise.

  • We run it manually: I was expecting this to be way ahead of the pack, but at nearly 38% I was wrong, which made me happy. I have no problem if people are running Ansible or Terraform manually. A pipeline is just a bunch of building blocks threaded together (see the sketch after this list). The fact people have taken the time to build these blocks is great. Threading them together is nice, but it’s not the “be all and end all”. The important bit is the definitions of the systems are in code!
  • Automated pipeline: Over 33% made me happy. My assumption was this would be lower. I was wrong, and I’m glad.
  • Terrahawks was a cartoon: The people who picked this were wrong! Terrahawks was a kids TV show using puppets, not animation. I’m really surprised nobody noticed this. The community let me down! πŸ˜‰ If we discount this response from the mix, it makes the other two responses close to 50:50, which is cool.
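
To illustrate the point about threading building blocks together, it can be as simple as a wrapper script like the one below. The inventory and playbook names are made up, so treat this as a sketch of the idea rather than a working pipeline.

#!/bin/bash
# Provision the infrastructure, then configure it.
# A CI server can run this on every commit, or a human can run it by hand.
terraform apply -auto-approve
ansible-playbook -i inventory.ini db_server.yml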

On a bit of a tangent, I wanted to know how dominant Git was these days.

What Version Control System (VCS) are you using for your database scripts and code?

  • Git – On Prem: I knew Git would dominate, but I wasn’t sure if people would be hosting their repositories on-prem or in the cloud. With a response of over 30%, that means nearly half of the Git users were hosting their repositories on-prem, which was higher than I expected.
  • Git – Cloud-based: I expected this to dominate, so 37% was a little lower than I expected. Only a little over half of the Git users were using cloud-based repositories. We use cloud-based Git repositories, but we always keep a backup on-prem. Just in case.
  • Other VCS – Not Git: I expected this to have a reasonable showing as VCS software like Subversion used to be really popular, so I knew things would linger. Nearly 19% isn’t bad. I don’t think there is anything wrong with using something other than Git, but Git has become so pervasive in tooling it probably makes sense to take the plunge.
  • VCS is for wimps: I’m hoping nearly 13% of the respondents were picking this to wind me up, but I suspect they weren’t. If you are not currently using version control, please start!

Version control is at the heart of automation, DevOps, Infrastructure as Code (IaC) and all that funky business, so if people can just get that right they have taken the first step on the journey.
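
If you’re starting from zero, that first step is smaller than you might think. Assuming a hypothetical build script, getting it under version control looks something like this.

git init db-scripts
cd db-scripts
# Copy your existing scripts in, then track them.
git add create_db.sh
git commit -m "Initial version of the database build scripts"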

So overall this makes very pleasant reading. Lots of people are provisioning databases using some form of scripting, rather than GUIs, and a bunch of people are automating that provisioning. This is what I wanted to hear.

Cheers

Tim…

PS. You know the caveats. These are small sample sizes. My audience has an Oracle bias. I’m no expert at automation, DevOps and the cloud. Just a keen dabbler.

Are you running production databases on the cloud? Poll results discussed.

It can be quite difficult to know if your impression of technology usage is skewed. Your opinion is probably going to depend on a number of factors including what you read, who you follow, and the type of company you work for. For this reason I asked some questions on Twitter the other day, just to gauge the response.

Let me start by saying, this is a small sample size, and most of my followers come from the Oracle community, including a number of Oracle staff. This may skew the results compared to other database engines and technology stacks. I’m commenting on the results as if this were a representative sample, but you can decide if you think it is…

So this was the first question I asked.

Is your company running production relational databases in the cloud?

We can see there was a fairly even spread of answers.

  • All prod DBs in cloud: A response of nearly 19% picking this option kind-of surprised me. I speak to a lot of people, and there always seems to be something they’ve got that doesn’t fit well in the cloud for them. Having this many people saying they’ve managed to make everything fit is interesting.
  • Some prod DBs in cloud: I expected this response to be high and with over 27% it was. When we add this to the previous category, we can see that over 46% of companies have got some or all of their production relational databases in the cloud. That’s a lot.
  • Not yet, but planned: At over 24%, when added to the previous categories, it would seem that over 70% of companies perceive some value in running their databases in the cloud. Making that initial step can be difficult. I would suggest people try with a greenfield project, so they can test the water.
  • Over my dead body: At 29%, this is a lot of people that have no intention of moving their databases to the cloud at this moment in time. We might get some answers about why from the next question.

This was my second question.

What’s stopping you from moving your databases to the cloud?

Once again, we get a fairly even spread of responses.

  • Legal/Compliance: Over 17% of respondents have hit this brick wall. Depending on your industry and your country, cloud may not be an option for you yet. Cloud providers are constantly opening up data centres around the world, but there are still countries and regions that are not well represented. Added to that, some organisations can’t use public cloud. Most cloud providers have special regions for government or defence systems, but they tend to be focused in certain geographical regions. This is a show stopper, until the appropriate services become available, or some hybrid solution becomes acceptable.
  • Company Culture: At over 30%, this is a road block to lots of things. Any sort of technology disruption involves a change in company culture, and that’s one of the hardest things to achieve. It’s very hard to push this message from the bottom up. Ultimately it needs senior management who understand the need for change and *really* want to make that change. I say *really* because I get the feeling most management like to talk the talk, but very few can walk the walk.
  • Cloud is Expensive: At nearly 29%, this is an interesting one. The answer to the question, “is cloud more expensive?”, is yes and no. πŸ™‚ If you are only looking at headline figures for services, then it can seem quite expensive, but the cloud gives us a number of ways to reduce costs. Reserved instances reduce the cost of compute power. Selecting the correct shape and tier of the service can change costs a lot. Spinning down non-production services when they are not used, and down-scaling production services during off-peak hours can save a lot of money (see the sketch after this list), and these are not things that necessarily result in a saving on-prem. I also get the impression many companies don’t work out their total cost of ownership (TCO) properly. They forget that their on-prem kit requires space, power, lighting, cooling, networking, staffing etc. When they check the price of a service on the cloud, it includes all that, but if you don’t take that into consideration, you are not making a fair comparison. Some things will definitely be cheaper on the cloud. Some things, not so much. πŸ™‚
  • Cloud Sucks: At nearly 23%, this is a big chunk of people. It’s hard to know if they have valid reasons for this sentiment or not. Let’s take it on face value and assume they do. If this were a reflection of the whole industry, it’s going to be interesting to see how these people will be won over by the cloud providers.
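
As an example of the spin-down I mentioned above, stopping a non-production VM out of hours is a one-liner with most cloud CLIs. This sketch uses the Azure CLI with made-up resource names, and in practice you would drive it from a scheduler.

# Deallocate a non-production VM overnight, so you stop paying for the compute.
az vm deallocate --resource-group dev-rg --name dev-db-vm
# Bring it back before the start of the working day.
az vm start --resource-group dev-rg --name dev-db-vm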

The comments resulted in a few interesting things. I’ve responded to some of them here.

  • “Lack of cloud skills.” We all have to start somewhere. I would suggest starting with small proof of concept (POC) projects to test the water.
  • “Unreasonable Oracle licencing restrictions.” In case you don’t know, the core factor doesn’t apply to clouds other than Oracle Cloud, which makes Oracle licensing more expensive on non-Oracle clouds. Of course, everything can be negotiated.
  • “Lack of availability of Cloud experts to assist/advise.” I’m sure there are lots of people that claim they would be able to help, but how many with a proven track record is questionable. πŸ™‚
  • “We have a massive legacy estate to consider.” Certainly, not everything is easy to move to the cloud, and the bigger your estate, the more daunting it is. I’m sure most cloud providers would love to help. πŸ™‚
  • “Latency with fat client applications.” I had this conversation myself when discussing moving some of our SQL Server databases to Azure. It can be a problem!
  • “Seasonal businesses with uncertain money flow may not able to meet the deadlines for subscription payments.” Scaling services correctly could help with this. Scale down services during low periods, and scale up during high periods.
  • “The prime fear is being pulled off from the grid. Undependable internet connections.” Sure. Not every place has dependable networking.
  • “Bandwidth requirements & limited customization possibilities.” Ingress and egress costs vary with cloud providers. It may be that intelligent design of your processes can reduce the amount of data being pushed outside the cloud provider. The cloud is very customisable, so I’m not sure what the issue is here, but I’m sure there are some things that will be problematic for some people.

Overall I think this was an interesting exercise. Even five years ago I would have expected the responses to skew more in favour of on-prem. Barring some huge change in mindset, I would expect the answers to be even more in favour of cloud in another 5 years.

Regardless of your stance, it seems clear that familiarity with cloud services should be on your radar, if it’s not already. Your current company may not be fans of the cloud, but if you change jobs the cloud may be a high priority for your new company.

Cheers

Tim…

PS. I’ve been running my website on AWS since 2016. I started to write about some services on AWS and Azure in 2015. I’ve been playing with Oracle Cloud since its inception in 2016 (I think). Despite all this, I consider myself a dabbler, rather than an expert.

Remembering the bad old days of shared hardware…

I’m in the middle of a conversation with my boss about some old shared kit and it reminded me of the bad old days of shared hardware.

Nowadays we try to keep things really simple, with each VM/container serving a single purpose. Containers are great for this, because they allow the ultimate in granularity without the overhead associated with VMs.

Back in the old days we often had servers with loads of crap installed on them. Multiple versions of the database. Multiple versions of application servers. It was quite common to mix and match completely different tech stacks. The net result was you had a whole bunch of dependencies that meant you couldn’t change one thing without breaking everything else.

I’m sure some people have fond memories of those days, but I would suggest they are wearing rose-coloured spectacles. It was horrible, and I’m so glad we don’t do that anymore.

We have a couple of bits of shared kit left in our company, but once those are gone I’m going to purge the whole concept of shared kit from my brain and move on.

Cheers

Tim…

Why Automation Matters : It’s Not New and Scary!

It’s easy to think of automation as new and scary. Sorry for stating the obvious, but automation may be new to you, or new to your company, but plenty of people have been doing this stuff for a long time. I’m going to illustrate this with some stories from my past…

Automated Deployments

In 2003 I worked for a parcel delivery company that were replacing all their old systems with a Java application running against an Oracle back end. Their build process was automated using Ant scripts, which were initiated by a tool called Anthill. Once developers committed their code to version control (I think we used CVS at the time) it was available to be included in the nightly builds, which were deployed automatically by Anthill. Now I’m not going to make out this was a full CI/CD pipeline implementation, but this was 19 years ago, and how many companies are still struggling to do automated builds now?

Automated Installations

Back at my first Oracle OpenWorld in 2006 I went to a session by Dell, who were able to deploy a 16-node Oracle RAC by just plugging in the physical kit. They used PXE network installations, which included their own custom RPM that performed the Oracle RAC installation and config silently. The guy talking about the technical stuff was Werner Puschitz, who was a legend in the Oracle on Linux space back in the day. I wrote about this session here. This was 16 years ago and they were doing things that many companies still can’t do today.

I can’t remember when the Oracle Universal Installer (OUI) first allowed silent installations, but I’m pretty sure I used them for the first time in Oracle 9i, so that’s somewhere around the 2001 period. I have an article about this functionality here. I think Oracle 9.2 in 2002 was the first time the Database Configuration Assistant (DBCA) allowed silent database creation, but before the DBCA we always used to create databases manually using scripts anyway, so silent database creations in one form or another have been possible for well over 20 years. You can read about DBCA silent mode here. Build scripts for Oracle are as old as the hills, so there is nothing new to say here. The funny thing is, back in the day Oracle was often criticised for not having enough GUI tools, and nowadays nobody wants GUI tools. πŸ™‚
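
For anyone who hasn’t seen them, here’s roughly what those silent operations look like. The response file and database names are hypothetical, and the DBCA command is trimmed to the bare bones, so treat this as a sketch rather than a recipe.

# Silent software installation, driven by a response file.
./runInstaller -silent -responseFile /u01/software/db_install.rsp

# Silent database creation with DBCA (most parameters omitted for brevity).
dbca -silent -createDatabase -templateName General_Purpose.dbc -gdbName testdb -sid testdb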

Sorry, but if you are building stuff manually with GUIs, it kind-of means you’re a noob. If consultants are building things manually for you, they are wasting your time and need to be called out on it. At minimum you need build scripts, even if you can’t fully automate the whole process. A deliverable on any project should be the build scripts, not a 100-page Word document with screenshots.

Random – Off Topic

While writing this post I thought of a recent conversation with a friend. He was showing me videos of his automated warehouse. It had automated guided vehicles (AGVs) zipping around the warehouse picking up products to ship. It was all new and exciting to him. We were laughing because in 1996 I was renting a room in his house, and my job at the time was writing software for automated warehouses using Oracle on the back end. It wasn’t even a new thing 26 years ago. One of the projects I worked on was upgrading an existing automated warehouse that had already been in operation for about 10 years, with AGVs and automated cranes.

New is a matter of perception.

Final Thoughts

I’m not saying all this stuff in an attempt to make out I’m some kind of automation or DevOps thought leader. If you read my blog, you know all about me. I’m just trying to show that many of us have a long history in automation, even if we can’t check all the boxes for the latest buzzwords. Automation is not new and scary. It’s been part of the day-to-day job for a long time. In some cases we are using newer tools to tidy up things that were either already automated, or at least semi-automated. If someone is presenting this stuff like it’s some brave new world bullshit, they are really trying to pull the wool over your eyes. It should be an evolution of what you were already trying to do…

I wrote a series of posts about automation here.

Cheers

Tim…

Why Automation Matters : Why You Will Fail!

The biggest problem you are likely to encounter with any type of change is people!

People don’t want to change, even if they say they do. You would think an industry that is based on constant innovation would be filled with people who are desperate to move forward, but that’s not true. Most people like the steady state. They want to come to work today and do exactly what they did yesterday.

Automation itself is not that difficult. The difficult part is the culture change required. There is a reason why new startup companies can innovate so rapidly. They are staffed by a small number of highly motivated people, who are all excited by the thought of doing something new and different. The larger and more established a company becomes, the harder it is to innovate. There are too many people who are happy to make do. Too many layers of management who, despite what they say in meetings, ultimately don’t want the disruption caused by change. Too many people who want to be part of the process, but spend most of their time focussing on “why not” and (sometimes unknowingly) sabotaging things, rather than getting stuck in. Too many people who suck the life out of you.

It’s exhausting, and that’s one of the worst things about this. It’s easy to take someone who is highly motivated and grind them down to the point where there is no more fight left in them, and they become a new recruit to the stationary crowd.

I’ve been around long enough to know this is a repeating cycle. When I started working in tech I encountered people telling me why relational databases were rubbish. Why virtualization was rubbish. Why NoSQL is rubbish. More recently why Agile is rubbish. Why containers are rubbish. Why cloud is rubbish. Why CI/CD is rubbish. Why DevOps is rubbish. The list goes on…

I’m not saying everything “new” is good and everything old is trash. I’m just saying you have to give things a proper go before you make these judgements. Decide what is the right tool for the job in question. Something might genuinely not be right for you, but that doesn’t mean it is crap for everyone. It also doesn’t mean it might not be right for you in the next project. And be honest! If you don’t want to do something, say you don’t want to do it. Don’t position yourself as an advocate, then piss on everyone’s parade!

I’m convinced companies that don’t focus on automation will die. If you have people trying to move your company forward, please support them, or at least get out of their way. They don’t need another hurdle to jump over!

I wrote a series of posts about automation here.

Cheers

Tim…

Why Automation Matters : Dealing With Vulnerabilities

The recent Log4j issues have highlighted another win for automation, from a couple of different angles.

Which Servers Are Vulnerable?

There are a couple of ways to determine this. I guess the most obvious is to scan the servers and see which ones ping for the vulnerability, but depending on your server real estate, this could take a long time.

An alternative is to manage your software centrally and track which servers have downloaded and installed vulnerable software. This was mentioned by a colleague in a meeting recently…

My team uses Artifactory as a central store for a lot of our base software, like:

  • Oracle Database and patches.
  • WebLogic and patches.
  • SQLcl
  • ORDS
  • Java
  • Tomcat

In addition the developers use Artifactory to store their build artifacts. Once the problem software is identified, you could use a tool like Artifactory to determine which servers contained vulnerable software. That would be kind-of handy…

This isn’t directly related to automation, as you could use a similar centralised software library for manual work, but if you are doing manual builds there’s more of a tendency to do one-off things that don’t follow the normal procedure, so you are more likely to get false negatives. If builds are automated, there is less chance you will “acquire” software from somewhere other than the central library.
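
As a sketch of what I mean, an automated build pulls its software from the central store, so every download is tracked. The URL and artifact name below are hypothetical.

# Pull the software from the central Artifactory store, never from a random website,
# so there is a record of which servers downloaded which versions.
curl -fSL -o sqlcl.zip "https://artifactory.example.com/artifactory/software-local/sqlcl/sqlcl-latest.zip"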

Fixing Vulnerable Software

If you use CI/CD, it’s a much simpler job to swap in a new version of a library or package, retest your software and deploy it. If your automated testing has good coverage, it may be as simple as a commit to your source control. The quick succession of Log4j releases we’ve seen recently would have very little impact on your teams.

If you are working with containers, the deployment process would involve a build of a new image, then replacing all containers with new ones based on the new image. Pretty simple stuff.
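
A minimal sketch of that workflow might look like the following, assuming a hypothetical image and container called “myapp”. In reality an orchestrator like Kubernetes would handle the replacement for you.

# Rebuild the image, picking up the patched library.
docker build -t myapp:patched .
# Replace the running container with one based on the new image.
docker stop myapp && docker rm myapp
docker run -d --name myapp myapp:patched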

If you are working in a more traditional virtual machine or physical setup, then having automated patching and deployments would give you similar benefits, even though it may feel more clunky…
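
As a taste of what that can look like, here’s an Ansible ad-hoc command to patch a group of servers. The inventory and group names are made up.

# Update all packages on every host in the "app_servers" group.
ansible app_servers -i inventory.ini -b -m yum -a "name=* state=latest"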

Conclusion

Whichever way you play it, the adoption of automation is going to improve your reaction time when things like this happen again in the future, and make no mistake, they will happen again!

I wrote a series of posts about automation here.

Cheers

Tim…

Log4j Vulnerabilities : My Random Thoughts

For technical information keep checking the Apache Log4j Security Vulnerabilities page for updates.

Someone on Twitter asked me to write something about the Log4j issue and my response was I’m not really qualified to do that. After some consideration I thought maybe my uneducated opinion would be useful to others, so here goes…

Basic Context

This is a variation of something I wrote on an internal P1 incident, to give people some context. Remember, there are a range of people reading this P1, so it was written to be understandable to a wide audience.

  • Log4j is an open source library used for logging in many Java applications. If you are not using Java apps, you are not using Log4j, so you are safe. If you are using Java apps, the vendor may not have used Log4j to do logging. This is why it is important to scan servers and check with the vendor to see if their software is vulnerable or not.
  • This is not an issue with Apache HTTP Server. Apache is a software foundation, which manages many commonly used open source software products, including the Apache HTTP Server. When you see “Apache Log4j”, the word “Apache” is a reference to the software foundation, not the HTTP server. As a result, it’s not safe to assume that if you don’t use Apache HTTP Server you are safe.
  • Client applications running on your PC are low risk compared to server applications. Most of the attacks are based around sending requests containing dodgy payloads to application servers. Your local PC applications don’t handle such requests, so are extremely unlikely to be affected. They should still be patched as soon as patches are available, but you don’t need to obsess about them.

Here’s a quick summary:

  • Not a Java application. Don’t worry.
  • Java application that doesn’t use Log4j. Don’t worry.
  • Java application that uses Log4j 1.x. Don’t worry about these vulnerabilities. Of course, older code may be susceptible to other vulnerabilities.
  • Java application that uses Log4j 2.x. Java 8 (or later), upgrade to Log4j release 2.17.1*.
  • Java application that uses Log4j 2.x. Java 7, upgrade to Log4j release 2.12.4*.
  • If upgrading Log4j is not an immediate option, maybe because you are waiting for a vendor to release a patch, consider mitigations until upgrades are possible.

* These versions were correct at the time of writing. Keep checking the Apache Log4j Security Vulnerabilities page for updates.
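
If you build Java applications yourself and use Maven, a quick way to see which Log4j version a build actually pulls in, including transitive dependencies, is the dependency tree. Other build tools have equivalents.

mvn dependency:tree | grep -i log4j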

Mitigations are not Solutions

Upgrading Log4j is the only way to be sure.

  • Java 8 (or later) users should upgrade to Log4j release 2.17.1.
  • Java 7 users should upgrade to Log4j release 2.12.4.

In the early days of the vulnerabilities, most people focused on mitigations. Probably the most common was to add this JVM parameter.

-Dlog4j2.formatMsgNoLookups=true

Or to set this environment variable.

LOG4J_FORMAT_MSG_NO_LOOKUPS=true

These worked for the initial vulnerability, but don’t stop all attacks. They are listed as “discredited” on the Apache Log4j Security Vulnerabilities page. It’s still worth doing this while you wait for patches from vendors, but this only limits your exposure. It’s not a complete fix. Do not do this and assume it’s game over!

Another option was to remove the JndiLookup.class completely. This is still listed as a valid mitigation if you are not able to upgrade. It may seem scary, but if a vendor patch is not forthcoming, you need to weigh up the risks.

zip -q -d log4j-core-*.jar org/apache/logging/log4j/core/lookup/JndiLookup.class

In addition to the direct mitigations, you also need to consider the bigger picture. Applications that are available to the outside world are clearly at enormous risk. If applications are only available inside your company network, then the risk is reduced. I’m not saying ignore internal applications, but prioritise the higher risk systems first maybe?

Which of my systems are at risk?

If you work in a mixed shop with lots of 3rd party products, that is not necessarily an easy question to answer. You can’t just search the file system for *log4j* and think that’s good enough. The log4j libraries are often deployed inside other JAR files or zip files.
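
You can get a crude first pass with a shell loop that looks inside each JAR, something like the sketch below, but it won’t spot libraries nested inside other JARs or zip files, which is one reason a dedicated tool is the better option.

# Check inside each JAR file for the vulnerable class, not just the file name.
find /u01 -name "*.jar" 2>/dev/null | while read -r jar; do
  unzip -l "$jar" 2>/dev/null | grep -q "JndiLookup.class" && echo "Check: $jar"
done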

I’ve been using the log4j-detector tool by MergeBase to scan servers. They’ve been releasing new versions regularly since this issue started. It seems to work well, but I’m sure other tools are available.

On Linux servers all my software is under “/u01”, so I scan like this.

java -jar /tmp/log4j-detector-2021.12.22.jar /u01

I don’t have many Windows servers, but here’s an example of a command I used to scan an Artifactory server. Notice it’s not a standalone Java installation on this server, but one shipped as part of Artifactory.

"E:\jfrog\artifactory\app\third-party\java\bin\java.exe" -jar log4j-detector-2021.12.22.jar E:\jfrog

I would suggest you scan systems, even if your vendor says they are safe. You never know what additional software has been installed by someone.

Log4j Developers

I’ve read a number of comments where people have criticised the Log4j developers, and for the most part I think they are totally out of order. The vast majority of companies have no engagement with open source software. They don’t commit code, and they don’t donate money to the projects they rely on. If you are just taking without giving anything back, I feel like you are not in a position to complain.

I understand some of the criticisms from a technical perspective, but hindsight is a wonderful thing. You could have spent time looking through the source code and highlighted stuff you didn’t like, but you didn’t. You could have got involved, but you didn’t.

I suspect there are developers of other common libraries checking their code for “exotic” features that need to be turned off by default…

Open Source

I’ve seen some people using the recent Log4j issues as a way to attack open source on a more general level. If people couldn’t see the source, they wouldn’t find the exploits, right? I don’t buy this. Security through obscurity is not security. I wonder how much longer these vulnerabilities would have existed if the source code was not freely available?

Vendor Reactions

The reaction of vendors has been really interesting. Some have been really quick to react. Some, not so much. At times like this it’s really important that vendors release a statement as soon as possible, even if that is a message that says they are aware of the issue and are investigating. Your typical “watch this space” message. If your vendors were slow to react, or didn’t react at all, then I think you need to question whether you should be working with their products.

This is true for vendors of products that don’t even use Java. In addition to scanning, we have been compiling statements from vendors regardless of their technology stack. For a vulnerability this high profile, I think it’s important all vendors release a statement. It may sound ridiculous to you, but not every person involved in the process has a grasp of what technology stack is used by each product. If a vendor provides a clear statement, then it makes life a lot easier.

Oracle Advisory

The Oracle advisory came out pretty quick, and has been updated frequently over the last week as more patches have been released. Keep an eye on it over the next few days, as I expect some existing patches will be reissued with Log4j 2.17.

https://www.oracle.com/security-alerts/alert-cve-2021-44228.html

You still need to use your brain when determining the risk. The Oracle database is marked as not vulnerable, but there are some items shipped with the database that use vulnerable log4j versions. For example SQL Developer is shipped with the database, but this is a client tool. It is not receiving HTTPS requests from users, so it’s not a threat. There are patched versions of this available, but do you care? In a similar manner Enterprise Manager is vulnerable, and you should patch it, but it shouldn’t be accessible publicly, so the threat is lower. The chances are only your DBAs have firewall access to this server, so it represents a smaller threat than a public facing application server.

Conclusion

It has been a shit show, and there is little sign of it calming down much before Christmas, but you have to use this as a learning experience. Please apply patches as soon as they are available. If your vendor is slow off the mark, apply the mitigations while you wait.

As I’ve said, I’m not an expert, just someone trying to cope with these issues. If you see anything you think is factually incorrect, please tell me so I can correct it.

Cheers

Tim…

Windows 11 : My first few days…

I’ve been using Windows 11 for a few days now and I thought I would give my impressions so far.

Installation

I picked the upgrade from my Windows Update screen and it just worked. I didn’t have any dramas from the upgrade. After the upgrade I had two or three rounds of Windows Updates that needed reboots, but I kind-of expected that.

I’m sure people with older kit will have some different experiences, but on this Dell XPS 15″ with an i9 (6 cores), 32GB RAM and an NVMe M.2 drive things went fine.

First Impressions

I have macOS now… πŸ™‚

The most striking thing is the change to the taskbar. It’s reminiscent of the macOS dock when it is idle. All the items are centralised, but you can move them to the left if you prefer that. When you compare Windows 11 to macOS Big Sur they look nothing like each other, but you get the vibe Microsoft were “inspired” by that look.

When you click the Windows button/key you get a much more streamlined start menu, which was a bit of a shock at first, but I think I prefer it. One gripe is all the stuff I had pinned to the start menu was lost after the upgrade, and replaced with bullshit I don’t care about. It only took a few minutes to sort that though.

Once you start using the OS it feels like Windows 10, but with rounded corners. There is a lot more consistency with the “design language” of the interface. Many of the common dialogs have been reworked to be consistent with the new look and feel, but there are still a bunch of things that never seem to change. Open up “Computer Management” and it feels kind-of jarring. It doesn’t follow the theme and it feels like you’ve switched back several versions of Windows. It’s not a problem, as most of the common dialogs are fine, but it is a little disappointing.

Unlike the super-glassy finish of Windows Vista, there is some transparency on certain menus in Windows 11, but it is very subtle.

Hiccups

I had a few hiccups along the way. They were all quite minor really.

  • The upgrade killed the VPN client I use for work. I had to uninstall it and install it again. The solution was pretty simple, but I was kind-of tense for a while.
  • The upgrade uninstalled “Teams for Work and School” and replaced it with the consumer version of Teams. That meant I couldn’t connect with anyone from work. I downloaded and installed “Teams for Work and School” and it was all good.
  • As I mentioned before, all the things I had pinned to the start menu were lost and I had to remove a load of crap and re-pin things.

None of these things were real dramas, but if you were under a time constraint you might find yourself swearing at the computer!

Heavy Usage

Minecraft works! πŸ™‚

Most of my heavy use revolves around VirtualBox, Vagrant and Packer. I’ve built some new Vagrant boxes using Packer, and used those boxes for Vagrant builds of VirtualBox VMs, and I haven’t run into any problems yet.
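
For context, that workflow boils down to something like this. The template and box names are hypothetical.

# Build the box image with Packer.
packer build oracle-linux-8.json
# Register the resulting box, then bring up a VM from it.
vagrant box add --name ol8 oracle-linux-8.box
vagrant up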

I record and edit videos using Camtasia, and it seems happy running on Windows 11.

Most of my life is not spent doing process-heavy things. I spend most of my time in a browser or a shell prompt. I connect to Linux boxes at home and at work using MobaXterm. I’ve had no dramas with this day-to-day stuff.

I had a look on the interwebs and a few gamers have been complaining about Windows 11, so if you are a PC gamer, now might not be a good time to make the switch from Windows 10.

Overall Impressions

It’s the same, but different. The safe approach is to stick with Windows 10 for a few more years. I don’t think you are missing out on anything by doing that. If you fancy the jump to Windows 11 and you have reasonably new kit, go for it.

Cheers

Tim…