Introduction
A directory traversal vulnerability in Apache Flink versions 1.11.0, 1.11.1, and 1.11.2 has been found and registered as CVE-2020-17519.
What is Apache Flink?
Apache Flink is an open-source, unified stream-processing and batch-processing framework developed by the Apache Software Foundation. The core of Apache Flink is a distributed streaming data-flow engine written in Java and Scala. Flink executes arbitrary dataflow programs in a data-parallel and pipelined manner.
https://www.shodan.io/search?query=flink+port%3A8081
Background Story
First I figured out how to debug Apache Flink and edited its configuration so that I could debug it remotely, which gives the huge advantage of following a request step by step. I still had to figure out where to set the breakpoints, so I started reading through the code (I read almost everything :D). From previous experience and from reading the API doc, I decided to check the router package, and there I found 6 classes.
The classes are not all directly related, but one of them handles incoming HTTP requests and routes them to the right handler. To build a clearer picture, I had to understand all 6 classes in more detail.
With that done, channelRead0 looked like an interesting method, so I added a breakpoint there, sent the request that triggers the vulnerability, and started stepping into the program.
After a lot of stepping in, I found the method that decodes the URL and extracts the path from it, as well as the method that reads the file and how the file gets loaded.
Build the lab
Install the system and prerequisites
OS: Ubuntu Server 20.04
You will need to install Java and Maven
sudo apt update
sudo apt install default-jdk
Download Maven from the following link: https://dlcdn.apache.org/maven/maven-3/3.2.5/binaries/apache-maven-3.2.5-bin.tar.gz
sudo tar xf apache-maven-3.2.5-bin.tar.gz -C /opt
sudo ln -s /opt/apache-maven-3.2.5 /opt/maven
sudo vim /etc/profile.d/maven.sh
Add the following lines to the file:
export JAVA_HOME=/usr/lib/jvm/default-java
export M2_HOME=/opt/maven
export MAVEN_HOME=/opt/maven
export PATH=${M2_HOME}/bin:${PATH}
sudo chmod +x /etc/profile.d/maven.sh
source /etc/profile.d/maven.sh
mvn -version
Install Apache Flink
Download Apache Flink from the following link:
unzip flink-release-1.11.0.zip
cd flink-release-1.11.0
mvn clean package -DskipTests
Setup the debugger
Open flink-conf.yaml
Add the following:
env.java.opts: "-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=1337"
With that, a debugger will be able to attach on port 1337.
Now start the cluster
./build-target/bin/start-cluster.sh
You can check that it is running in your browser:
http://localhost:8081/#/overview
Now run the debugger in this file
Reproduce the vulnerability
Once everything is installed and running, you should be able to reproduce the vulnerability easily by browsing to the following link:
Static Analysis
Explaining the code
From reading the documentation and going through whatever information I could find about this CVE, I knew that the issue is in the REST API.
Notice that the endpoint “jobmanager/logs” is part of the REST API:
https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/ops/rest_api/
The doc also states that the REST API backend lives in flink-runtime.
From previous experience, for example the analysis of Joomla CVE-2023-23752 (https://www.vicarius.io/vsociety/blog/cve-2023-23752-joomla-unauthorized-access-vulnerability), I knew that there would be some routing function and route handlers that handle and process the request.
Going through the folders and files and reading a lot of code under flink-runtime/src/main/java/org.apache.flink.runtime, I found a folder named “rest”, which I figured refers to the REST API. Inside it I found a handler folder, and inside that a router folder.
I started to read the code in each file, one by one.
MethodlessRouter
The MethodlessRouter class has the following components:
The routes map is a map of PathPattern objects to target objects.
The routes() method returns an unmodifiable map of all routes in the router.
The addRoute method adds a new route to the router by creating a new PathPattern object for the specified path pattern and adding it to the map of routes.
The removePathPattern method removes the route specified by the path pattern.
The route method takes a URI, a decoded path, a map of query parameters, and an array of path tokens as input, and returns a RouteResult object that contains the target object for the matched route, along with any path parameters and query parameters. It loops through the map of routes and checks if the path tokens match any of the PathPattern objects. If a match is found, the target object is returned along with any path parameters and query parameters.
The anyMatched method checks if there is any matching route for the given array of path tokens. It loops through the map of routes and checks if the path tokens match any of the PathPattern objects. If a match is found, it returns true.
The size method returns the number of routes in the router.
PathPattern
PathPattern represents a pattern used to match a URL path. The class takes a path pattern as input in its constructor and creates a list of tokens from the pattern. The pattern can contain constants or placeholders, and if it exists, the placeholder with the format :* is a special placeholder that catches the rest of the path (which may include slashes).
The class has two instance variables, pattern and tokens, both of which are final and set in the constructor.
pattern is a string representing the pattern without slashes at both ends.
tokens is an array of strings representing the pattern split by the / character.
PathPattern()
The constructor creates a new PathPattern object from a String pattern. It checks whether the pattern contains a query (for example: constant1/constant2?foo=bar), removes slashes from both ends of the pattern using removeSlashesAtBothEnds(), and splits the pattern into tokens.
removeSlashesAtBothEnds()
This is a static utility method that removes slashes from both ends of a path. It takes a String path, checks if it is empty, finds the first non-slash character, finds the last non-slash character, and returns the substring between them.
match()
params will be updated with the parameters embedded in the request path.
This method is designed so that requestPathTokens and params can be created only once and then reused, to optimize for performance when a large number of path patterns need to be matched.
Returns: false if not matched; in this case params should be reset.
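The token matching described above can be sketched in a few lines. This is a simplified, hypothetical stand-in (the class name PathPatternSketch is mine) and not Flink's actual source; the pattern /jobmanager/logs/:filename is an assumption based on the endpoint and the filename parameter seen later in the debugger.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for PathPattern; not Flink's actual source.
class PathPatternSketch {
    private final String[] tokens;

    PathPatternSketch(String pattern) {
        if (pattern.contains("?")) {
            throw new IllegalArgumentException("pattern must not contain a query");
        }
        this.tokens = removeSlashesAtBothEnds(pattern).split("/");
    }

    // Strips leading and trailing '/' characters, e.g. "/a/b/" becomes "a/b".
    static String removeSlashesAtBothEnds(String path) {
        int begin = 0;
        while (begin < path.length() && path.charAt(begin) == '/') {
            begin++;
        }
        int end = path.length();
        while (end > begin && path.charAt(end - 1) == '/') {
            end--;
        }
        return path.substring(begin, end);
    }

    // Returns true on a match and stores placeholder values in params.
    // On false, params may hold partial values and should be reset by the caller.
    boolean match(String[] requestPathTokens, Map<String, String> params) {
        boolean catchAll = tokens.length > 0 && tokens[tokens.length - 1].equals(":*");
        int fixed = catchAll ? tokens.length - 1 : tokens.length;
        if (catchAll ? requestPathTokens.length < fixed : requestPathTokens.length != fixed) {
            return false;
        }
        for (int i = 0; i < fixed; i++) {
            String key = tokens[i];
            String value = requestPathTokens[i];
            if (key.startsWith(":")) {
                params.put(key.substring(1), value); // placeholder: capture the value
            } else if (!key.equals(value)) {
                return false;                        // constant token does not match
            }
        }
        if (catchAll) {
            // ":*" captures the rest of the path, slashes included.
            StringBuilder rest = new StringBuilder();
            for (int i = fixed; i < requestPathTokens.length; i++) {
                if (i > fixed) {
                    rest.append('/');
                }
                rest.append(requestPathTokens[i]);
            }
            params.put("*", rest.toString());
        }
        return true;
    }

    public static void main(String[] args) {
        PathPatternSketch pattern = new PathPatternSketch("/jobmanager/logs/:filename");
        Map<String, String> params = new HashMap<>();
        boolean matched = pattern.match(new String[] {"jobmanager", "logs", "flink.log"}, params);
        System.out.println(matched + " " + params); // true {filename=flink.log}
    }
}
```

Note that the matcher only compares tokens. Whatever string ends up in the placeholder token is stored as-is, which is why an already-decoded ../../etc/passwd can land in the filename parameter.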
RoutedRequest.java
This covers PathPattern, which is the same class explained above, and the RoutedRequest class, which handles an HttpRequest with its associated RouteResult.
Router
I will not go through the code for this one; the doc explains it very well.
RouteResult
RouteResult is a class that represents the result of calling the Router#route(HttpMethod, String) method. It contains information about the matched route, such as the original request URI, the decoded request path, path parameters, and query parameters. It also holds a reference to the target that will handle the request.
The RouteResult class is defined with a generic type T, which represents the target that will handle the request.
The RouteResult class has several instance variables:
uri: represents the original request URI.
decodedPath: represents the decoded request path.
pathParams: a map that contains all the path parameters embedded in the request path.
queryParams: a map that contains all the query parameters in the request URI.
target: the target that will handle the request.
The RouteResult class provides several methods to get parameters from the path and query parameters:
queryParam(name): extracts the first matching parameter in queryParams.
param(name): extracts the parameter in pathParams first, then falls back to the first matching parameter in queryParams.
params(name): extracts all parameters in pathParams and queryParams matching the name.
RouterHandler
The RouterHandler class is an inbound handler that converts an HttpRequest to a RoutedRequest and passes the RoutedRequest to the matched handler. It also replaces the standard error response so that it is identical to those sent by the AbstractRestHandler.
ROUTER_HANDLER_NAME and ROUTED_HANDLER_NAME are constants used as names for the handlers in the Netty pipeline.
LOG is a logger instance for logging debug or trace information about the handler.
responseHeaders is a map containing headers to be included in the HTTP response.
router is an instance of the Router class, which is used to route incoming requests to their respective handlers.
The RouterHandler constructor takes a Router and a map of headers as parameters and initializes the router and responseHeaders fields accordingly.
getName() is a method that returns the name of the handler.
The RouterHandler class overrides the channelRead0 method, which is called by Netty whenever a new message is received on the channel. This method is responsible for routing incoming requests to their respective handlers and generating responses.
The overridden channelRead0 method first checks if the request expects 100-continue, and if so, sends a continue response and returns.
The HTTP 100 Continue informational status response code indicates that everything so far is OK and that the client should continue with the request or ignore it if it is already finished.
It then extracts the HTTP method and path from the request using httpRequest.getMethod() and httpRequest.uri(), respectively. It passes the method and path to the router.route method to obtain a RouteResult object, which contains the matched handler and any path or query parameters.
If the routeResult object is not null, the routed method is called to pass the request to the matched handler; otherwise, respondNotFound is called and a Not Found response is sent.
The routed method retrieves the handler from the routeResult object and adds it to the Netty pipeline using pipeline.addAfter or pipeline.replace, depending on whether the handler was already added to the pipeline.
Finally, it creates a new RoutedRequest object using the routeResult and httpRequest, and passes it to the next channel handler in the pipeline using channelHandlerContext.fireChannelRead.
Now that all the involved classes are explained, you can see why I found RouterHandler the most interesting one.
Debugging
Before we start debugging, don’t forget to run the debugger you set up before.
Add a breakpoint here:
Now send the request, or simply go to
http://localhost:8081/jobmanager/logs/..%252f..%252f..%252f..%252f..%252f..%252f..%252f..%252f..%252f..%252f..%252f..%252fetc%252fpasswd
Instantly, you will see something like this; notice the httpRequest with everything related to it, such as the method, the URI, etc.
Keep your eyes on the code and the debugger.
Now step in.
How does the URL get decoded and read as a path
Once you reach line 82 in RouterHandler.java:
RouteResult<?> routeResult = router.route(method, qsd.path(), qsd.parameters());
Click the step-in button, and you will notice that route, path(), and parameters() get highlighted.
Click on path().
This will lead you to this method in QueryStringDecoder.class:
Now click step-in.
Step in again, and this will take you to the decodeComponent method.
This is basically a method that decodes a portion of a string that may contain URL-encoded characters.
private static String decodeComponent(String s, int from, int toExcluded, Charset charset, boolean isPath) {
    int len = toExcluded - from;
    if (len <= 0) {
        return "";
    } else {
        int firstEscaped = -1;
        int decodedCapacity;
        for(int i = from; i < toExcluded; ++i) {
            decodedCapacity = s.charAt(i);
            if (decodedCapacity == 37 || decodedCapacity == 43 && !isPath) {
                firstEscaped = i;
                break;
            }
        }
        if (firstEscaped == -1) {
            return s.substring(from, toExcluded);
        } else {
            CharsetDecoder decoder = CharsetUtil.decoder(charset);
            decodedCapacity = (toExcluded - firstEscaped) / 3;
            ByteBuffer byteBuf = ByteBuffer.allocate(decodedCapacity);
            CharBuffer charBuf = CharBuffer.allocate(decodedCapacity);
            StringBuilder strBuf = new StringBuilder(len);
            strBuf.append(s, from, firstEscaped);
            for(int i = firstEscaped; i < toExcluded; ++i) {
                char c = s.charAt(i);
                if (c != '%') {
                    strBuf.append(c == '+' && !isPath ? ' ' : c);
                } else {
                    byteBuf.clear();
                    do {
                        if (i + 3 > toExcluded) {
                            throw new IllegalArgumentException("unterminated escape sequence at index " + i + " of: " + s);
                        }
                        byteBuf.put(StringUtil.decodeHexByte(s, i + 1));
                        i += 3;
                    } while(i < toExcluded && s.charAt(i) == '%');
                    --i;
                    byteBuf.flip();
                    charBuf.clear();
                    CoderResult result = decoder.reset().decode(byteBuf, charBuf, true);
                    try {
                        if (!result.isUnderflow()) {
                            result.throwException();
                        }
                        result = decoder.flush(charBuf);
                        if (!result.isUnderflow()) {
                            result.throwException();
                        }
                    } catch (CharacterCodingException var16) {
                        throw new IllegalStateException(var16);
                    }
                    strBuf.append(charBuf.flip());
                }
            }
            return strBuf.toString();
        }
    }
}
The method takes the following parameters: a string s, a starting index from, an ending index toExcluded, a character set charset, and a boolean isPath.
private static String decodeComponent(String s, int from, int toExcluded, Charset charset, boolean isPath) {
The length of the portion of the string to decode is calculated as the difference between the ending index and the starting index. If the length is zero or negative, an empty string is returned. Otherwise, the decoding process begins.
int len = toExcluded - from;
if (len <= 0) {
    return "";
} else {
The firstEscaped variable is set to -1 to indicate that no URL-encoded characters have been found yet. The loop iterates over the portion of the string to decode and checks each character. If a character is a percent sign (`%`), or a plus sign (`+`) while the isPath flag is false (indicating that the string is not a URL path), firstEscaped is set to the index of that character and the loop breaks.
int firstEscaped = -1;
int decodedCapacity;
for(int i = from; i < toExcluded; ++i) {
    decodedCapacity = s.charAt(i);
    if (decodedCapacity == 37 || decodedCapacity == 43 && !isPath) {
        firstEscaped = i;
        break;
    }
}
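The condition in that loop is easy to misread, because && binds tighter than || in Java. The small sketch below isolates the check in a hypothetical helper (startsEscape is my name, not Netty's) to show that a percent sign always counts, while a plus sign counts only outside of a path.

```java
// Hypothetical helper that isolates the escape check from the loop above.
class EscapeCheckSketch {
    static boolean startsEscape(char c, boolean isPath) {
        // 37 is '%' and 43 is '+'. Because && binds tighter than ||,
        // this reads as: c == '%' || (c == '+' && !isPath)
        return c == 37 || c == 43 && !isPath;
    }

    public static void main(String[] args) {
        System.out.println(startsEscape('%', true));  // true
        System.out.println(startsEscape('+', true));  // false
        System.out.println(startsEscape('+', false)); // true
    }
}
```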
If no URL-encoded characters were found, the entire portion of the string is returned unmodified using the substring method. Otherwise, the decoding process continues.
if (firstEscaped == -1) {
    return s.substring(from, toExcluded);
} else {
    ...
}
A CharsetDecoder object is created using the provided character set.
The decodedCapacity variable is set to the maximum number of bytes that could be required to represent the URL-encoded portion of the string in the given character set: each escape sequence (%XX) takes three characters and yields one byte, hence the division by 3.
Byte and character buffers are allocated to hold the decoded data.
A StringBuilder is created to accumulate the decoded characters.
Finally, strBuf.append(s, from, firstEscaped) appends the part of the original input string s that precedes the first escaped character to the StringBuilder object strBuf.
CharsetDecoder decoder = CharsetUtil.decoder(charset);
decodedCapacity = (toExcluded - firstEscaped) / 3;
ByteBuffer byteBuf = ByteBuffer.allocate(decodedCapacity);
CharBuffer charBuf = CharBuffer.allocate(decodedCapacity);
StringBuilder strBuf = new StringBuilder(len);
strBuf.append(s, from, firstEscaped);
The last part of the method decodes any URL-encoded characters found in the string and appends the resulting decoded characters to the StringBuilder that will be returned as the decoded string.
The loop starts at the index of the first URL-encoded character found earlier (`firstEscaped`) and iterates over each character in the remaining portion of the string to decode. If the character is not a percent sign (`%`), it is appended to the StringBuilder directly. If it is a percent sign, it indicates the start of a URL-encoded sequence, and the byte buffer is cleared.
The loop then reads the two hexadecimal digits that follow the percent sign in the input string, converts them to a byte value, and appends that byte to the byte buffer. This continues for as long as the next character is another percent sign, and stops when a non-percent character is found or the end of the string is reached. If fewer than two characters follow a percent sign before the end of the string, an "unterminated escape sequence" exception is thrown.
for(int i = firstEscaped; i < toExcluded; ++i) {
    char c = s.charAt(i);
    if (c != '%') {
        strBuf.append(c == '+' && !isPath ? ' ' : c);
    } else {
        byteBuf.clear();
        do {
            if (i + 3 > toExcluded) {
                throw new IllegalArgumentException("unterminated escape sequence at index " + i + " of: " + s);
            }
            byteBuf.put(StringUtil.decodeHexByte(s, i + 1));
            i += 3;
        } while(i < toExcluded && s.charAt(i) == '%');
        --i;
        byteBuf.flip();
        charBuf.clear();
        CoderResult result = decoder.reset().decode(byteBuf, charBuf, true);
        try {
            if (!result.isUnderflow()) {
                result.throwException();
            }
            result = decoder.flush(charBuf);
            if (!result.isUnderflow()) {
                result.throwException();
In other words, the method iterates through the URL path, which you will see as the value of the variable s. The value ends up being decoded twice: the first pass turns the double URL encoding into a single encoding, and the second pass decodes that again, leaving the plain passwd path. This happens because on each pass the method finds a character (%) indicating that this is a URL-encoded value.
You can see here that it is decoded from ..%252f to ..%2f
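The effect of decoding twice can be reproduced outside of Flink. The sketch below uses the JDK's java.net.URLDecoder as a stand-in for Netty's decodeComponent (an assumption for illustration only; the two are different implementations) and a shortened version of the payload.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Illustration only: URLDecoder stands in for Netty's decodeComponent.
class DoubleDecodeSketch {
    static String decodeOnce(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        String raw = "..%252f..%252fetc%252fpasswd";
        String first = decodeOnce(raw);    // %25 becomes '%'
        String second = decodeOnce(first); // %2f becomes '/'
        System.out.println(first);  // ..%2f..%2fetc%2fpasswd
        System.out.println(second); // ../../etc/passwd
    }
}
```

After the first pass the string still looks harmless to anything that only checks for literal slashes, and only the second pass produces the traversal sequence.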
Watch the video here for more understanding:
Once you get here
Hit another step-in; you should get here (if not, just go to MethodlessRouter.java and add a breakpoint at line 94)
You will find yourself here:
You can see the pathParams variable: it is basically a map in which filename maps to ../../../../../../../../../../../../etc/passwd
unmodifiableMap
Returns an unmodifiable view of the specified map. This method allows modules to provide users with “read-only” access to internal maps. Query operations on the returned map “read through” to the specified map, and attempts to modify the returned map, whether direct or via its collection views, result in an UnsupportedOperationException. The returned map will be serializable if the specified map is serializable.
How is the file read and accessed
Keep following the debugger; you should reach this line:
This is amazing, because the value of file is
/home/us1/Desktop/flink-release-1.11.0/flink-dist/target/flink-1.11.0-bin/flink-1.11.0/log/../../../../../../../../../../../../etc/passwd
Just to see how this works, go to your terminal and cat this path: it will print the passwd file.
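The same resolution can be checked in Java. This is a minimal sketch assuming a hypothetical log directory /opt/flink/log on a Unix-like system. It builds the File the same way (parent directory plus child name) and normalizes the result to show where the path really points.

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;

// Sketch with a hypothetical log directory; shows how ".." segments escape it.
class TraversalSketch {
    // Joins the directory and the untrusted name, then collapses ".." segments.
    static Path resolveUnchecked(String logDir, String filename) {
        return new File(logDir, filename).toPath().normalize();
    }

    // True when the resolved path no longer lies inside the log directory.
    static boolean escapesBaseDir(String logDir, String filename) {
        Path base = Paths.get(logDir).normalize();
        return !resolveUnchecked(logDir, filename).startsWith(base);
    }

    public static void main(String[] args) {
        String logDir = "/opt/flink/log"; // hypothetical location
        System.out.println(resolveUnchecked(logDir, "../../../etc/passwd")); // /etc/passwd on Unix
        System.out.println(escapesBaseDir(logDir, "../../../etc/passwd"));   // true
        System.out.println(escapesBaseDir(logDir, "flink.log"));             // false
    }
}
```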
Keep stepping in.
You will start seeing information about the file in the debugger, such as the file length, the permissions (writable, readable, append), the path, whether it is open or closed, etc.
The function that loads the file
Keep following the debugger, and you will reach this snippet of code:
From here, with another step-in, you will get to the method that loads the file.
You can see the filename and the passwd path here.
Just for extra information, this uses File.java, which creates a new File instance from a parent abstract pathname and a child pathname string.
Finally, you will notice that the content of passwd is sent to the browser.
Mitigation
Any Apache Flink version after 1.11.2 is fixed; upgrade to 1.11.3, 1.12.0, or later.
Patch Diffing
We can see the changes here:
So basically, they made changes to JobManagerCustomLogHandler.java.
This is the line that changed:
String filename = handlerRequest.getPathParameter(LogFileNamePathParameter.class);
In the vulnerable version, the path parameter is used as-is. After the fix, the value is passed through File.getName(), so only the name of the file is kept. If an attacker attempts directory traversal, the path is no longer the one we saw before:
/home/us1/Desktop/flink-release-1.11.0/flink-dist/target/flink-1.11.0-bin/flink-1.11.0/log/../../../../../../../../../../../../etc/passwd
Instead, we get passwd only as the file name. As a result, the method cannot access the sensitive file outside the intended directory structure, and the attack is prevented.
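The idea of keeping only the file name can be sketched with java.io.File.getName(), which returns the last segment of a path. The class, method, and log directory below are hypothetical and only illustrate the technique; they are not the handler's actual code.

```java
import java.io.File;

// Hypothetical sketch: keep only the last path segment of an untrusted name.
class FileNameOnlySketch {
    static File resolveLogFile(File logDir, String requestedName) {
        // getName() drops every directory component, including "../" sequences.
        String filename = new File(requestedName).getName();
        return new File(logDir, filename);
    }

    public static void main(String[] args) {
        File logDir = new File("/opt/flink/log"); // hypothetical location
        File resolved = resolveLogFile(logDir, "../../../etc/passwd");
        System.out.println(resolved.getName()); // passwd
    }
}
```

With this in place, a request for ../../../etc/passwd can only ever resolve to a file named passwd inside the log directory, which normally does not exist.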
Final Thoughts
As a concept, this is not a very complicated vulnerability: a file-reading function is handed an attacker-controlled path such as ../../../../etc/passwd, and naturally it follows that path and loads the file.
What makes this breakdown complicated is that I’m trying to achieve what I like to call “deep understanding”. That’s because I’m a VSOCIETY ELITE 1337 member 😈, and I like to understand exactly what happened, when, how, and why.
I suggest you follow the debugging steps yourself, because they make more sense alongside the code explanation.
The patch diffing was pretty straightforward. I like patch diffing!