
Introduction
A command injection vulnerability in Apache Kylin has been found and registered as CVE-2021-45456.
Background Story
The basic story behind this vulnerability is that a user can create a project and dump the diagnosis information of that project. To dump the diagnosis information, the solution executes a script. Since the project name is controlled by the user, the user can enter a Linux command as the project name, but without special characters or spaces. Then, when sending the diagnosis request, the user can modify the project name (i.e. the Linux command) and add the spaces and other needed characters in URL-encoded form, so that it becomes a valid command. The solution processes this request, decodes the project name, and treats it as part of a Linux command during execution, so it executes the malicious payload.
Build the lab
I’m using Docker on Ubuntu Server 20.04.
Install docker
apt update
apt install docker.io docker-compose
Install Apache Kylin
docker pull apachekylin/apache-kylin-standalone:4.0.0
sudo docker run -d \
-m 8G \
-p 7070:7070 \
-p 8088:8088 \
-p 50070:50070 \
-p 8032:8032 \
-p 8042:8042 \
-p 2181:2181 \
-p 1337:1337 \
--name kylin-4.0.0 \
apachekylin/apache-kylin-standalone:4.0.0


Set up the debugger
First, configure the kylin.sh file. Open a shell in the container:
docker exec -it container_id bash
- File path:
/home/admin/apache-kylin-4.0.0-bin-spark2/bin/kylin.sh
- Go to the retrieveStartCommand() function, which builds the start command (line 267).

- Scroll down to line 307, which starts with the following:
$JAVA ${KYLIN_EXTRA_START_OPTS} ${KYLIN_TOMCAT_OPTS}
- Add the following options to that line:
-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=1337

- Restart the container:
docker container restart container_id
- Log in to Kylin on port 7070. I’m using the Docker IP; you can also use the localhost IP.
- Credentials:
admin:KYLIN


- Configure the debugger in IntelliJ IDEA so that it attaches to port 1337, the port set in the JDWP options above.




Reproduce the vulnerability
Based on the advisory, we will create a project whose name is the command without spaces or special characters, e.g. touchpawned.
After that, we will dump the diagnosis information for the project, and while doing so we will modify the request using Burp Suite to trigger the command injection.



- Once you click “Diagnosis”, intercept the request

- Change the name touchpawned to
%60touch%20pawned%60
which is the URL-encoded form of the following: `touch pawned`

- Now, check the container
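To make the encoding step concrete, here is a minimal sketch of how the payload maps to its URL-encoded form and back. It only uses the JDK's URLEncoder and URLDecoder; the class and method names are made up for this example and are not part of Kylin.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only, not Kylin code.
class PayloadEncodingDemo {
    // URLEncoder produces form encoding, where a space becomes '+'.
    // The '+' is rewritten to "%20" to match the form used in the request.
    static String encode(String payload) {
        return URLEncoder.encode(payload, StandardCharsets.UTF_8).replace("+", "%20");
    }

    // Decoding turns %60 back into a backtick and %20 back into a space.
    static String decode(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String encoded = encode("`touch pawned`");
        System.out.println(encoded);         // %60touch%20pawned%60
        System.out.println(decode(encoded)); // `touch pawned`
    }
}
```

The decoded value is what matters later: once the backticks and the space are back, a shell will treat the project name as a command substitution.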

Static Analysis & Debugging
NOTE: running Kylin means running other solutions along with it, including Spark, Kafka, HBase, Hive, Spring, etc. Therefore the debugging won’t be as detailed as usual, because it would take us into the source code of those other solutions.
Find an entry point
Based on the advisory, the vulnerability happens in the dumpProjectDiagnosisInfo method, but I want to go through how the request is handled, how the project gets created, how the name gets stored, and how the vulnerability gets triggered by the last request we saw.
- I searched for “projects” and found “ProjectController.java”. This class is responsible for listing all projects, saving, updating, and deleting a project, updating the project owner, and basically most of the project functions.
- I set a few breakpoints and created a new project called “test1”. You can see the values of the project in the projectDescData variable.

Understand how the project gets created and saved
- The first time we create a project, the solution uses the saveProject method. Let’s go through this method real quick.
@RequestMapping(value = "", method = { RequestMethod.POST }, produces = { "application/json" })
: This annotation maps the method to the endpoint for creating a new project instance. It specifies that the endpoint accepts a POST request with an empty URL and that it produces a JSON response.
@ResponseBody
: This annotation indicates that the method’s return value should be written directly to the response body.
public ProjectInstance saveProject(@RequestBody ProjectRequest projectRequest)
: This line defines the method signature, which takes a ProjectRequest object as the request body and returns a ProjectInstance object.
if (StringUtils.isEmpty(projectDesc.getName()))
: This line checks whether the name field of the ProjectInstance object is empty.
if (!ValidateUtil.isAlphanumericUnderscore(projectDesc.getName()))
: This line checks whether the name field of the ProjectInstance object contains only alphanumeric characters and underscores.
throw new BadRequestException(
: If the name field contains anything other than alphanumeric characters and underscores, a BadRequestException is thrown.
ProjectInstance createdProj = null;
try {
createdProj = projectService.createProject(projectDesc);
} catch (Exception e) {
throw new InternalErrorException(e.getLocalizedMessage(), e);
}
This snippet declares a ProjectInstance variable named createdProj and initially sets it to null. It then tries to create a new project by passing the projectDesc object to the createProject method of projectService.
If the project creation is successful, createdProj is assigned the newly created project instance. If an exception is thrown during project creation, the catch block wraps it in an InternalErrorException.
return createdProj;
: This line returns the createdProj object, which contains the newly created project instance.
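The name check is the reason the payload cannot be stored directly at creation time. Below is a minimal sketch of such a check. It assumes that ValidateUtil.isAlphanumericUnderscore behaves like a full match against letters, digits, and underscores; the class name and the regular expression are my own stand-ins, not copied from Kylin.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in for the validation step, for illustration only.
class ProjectNameCheckDemo {
    // Assumption: the real check is equivalent to matching this pattern.
    private static final Pattern ALNUM_UNDERSCORE = Pattern.compile("[a-zA-Z0-9_]+");

    static boolean isAlphanumericUnderscore(String name) {
        return name != null && ALNUM_UNDERSCORE.matcher(name).matches();
    }

    public static void main(String[] args) {
        // The name used at creation time passes the check.
        System.out.println(isAlphanumericUnderscore("touchpawned"));    // true
        // The real payload would be rejected if submitted at creation time.
        System.out.println(isAlphanumericUnderscore("`touch pawned`")); // false
    }
}
```

So the creation endpoint is guarded, and the payload has to arrive through a different request, which is the diagnosis request.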
How the diagnosis request gets processed & how the command gets executed
- It all starts from the dumpProjectDiagnosisInfo method. Set the breakpoints there.
- Now click on “Diagnosis” in the web UI. You can see the variables and their values in the debugger.
- The important line for me is the following:
String filePath = dgService.dumpProjectDiagnosisInfo(project, diagDir.getFile());
- Here we have the call to dumpProjectDiagnosisInfo. Follow it and you will find yourself in the DiagnosisService.java file.

- Keep following with the debugger. This is another interesting line:
String[] args = { project, exportPath.getAbsolutePath() };
This is an array named args that contains the project name along with the export path, which is where the diagnosis data is written. The path is obtained with getAbsolutePath(), a method of the File class that returns the absolute pathname of the given File object.

- After that, we see that runDiagnosisCLI(args) takes the args array as input.
- Step in, and here is the runDiagnosisCLI() method, where we can see args and its values.
File script = new File(KylinConfig.getKylinHome() + File.separator + "bin", "diag.sh");
This line creates a new File object representing a shell script named “diag.sh” located in the “bin” directory of the Kylin home directory.



If the script does not exist, the method throws a BadRequestException with a message that indicates the file could not be found.
- Now we have the diagCmd variable, which holds the script path followed by the args.

- Step in, and click getCliCommandExecutor().

- This takes you to getCliCommandExecutor. This method determines whether to use the remote access configuration of a Hadoop cluster in order to execute commands on it, i.e. remote commands. If the retrieved remote access configuration is null, which is what happened in our case, the commands are executed locally.

- You can see the value of executor that is returned.

- Next come the two execute methods in CliCommandExecutor. Both of them execute a shell command and return a Pair object containing the exit code and output of the command.
The first execute method takes only one argument: String command. It calls the second execute method with the same command argument, along with a default logAppender of new SoutLogger() and a jobId of null.
The second execute method takes the command, a logAppender (a logger instance used to log the output of the command), and a jobId (an optional identifier that can be used to track the execution of the command).
The method then checks whether a remote host has been specified for the CliCommandExecutor instance. If not, it runs the command locally using the runNativeCommand method, passing in the command, logAppender, and jobId. That method executes the command using a ProcessBuilder and captures the output and exit code of the command.
If a remote host has been specified for the CliCommandExecutor instance, the execute method instead runs the command on the remote host using the runRemoteCommand method.
Finally, the method checks the exit code of the command. If the exit code is non-zero, the method throws an IOException with an error message containing the exit code, error message, and command itself.


- Now step into the runNativeCommand method, since it’s the method that will execute the command.
The code defines a private method runNativeCommand, which is called by the execute method in the same class. It executes a shell command using ProcessBuilder and returns a Pair object containing the exit code and output of the command.
The method takes three arguments: command (the shell command to be executed), logAppender (a logger instance used to log the output of the command), and jobId (an optional identifier that can be used to track the execution of the command).
The method first constructs an array of strings named cmd, which contains the command and its arguments. The cmd array is constructed differently depending on the operating system: on Windows, the command is executed using cmd.exe /C, while on other operating systems (such as Linux or macOS), the command is executed using /bin/bash -c.
Then, the method constructs a ProcessBuilder instance using the cmd array and sets the redirectErrorStream property to true, which means that any error messages produced by the command are redirected to the same output stream as the command’s standard output.
The method then starts the process using ProcessBuilder.start() and registers it with a JobProcessContext if a jobId is provided.
Next, it reads the command’s standard output line by line using a BufferedReader and appends each line to a StringBuilder. For each line, if a logAppender is provided, the line is logged using the Logger.log() method.
If the method is interrupted by another thread (as determined by Thread.interrupted()), it destroys the process and returns a Pair object with an exit code of 1 and a message of “Killed”.
If the command execution completes successfully, the method waits for the process to exit using Process.waitFor() and returns a Pair object with the exit code and output of the command.
Finally, if the jobId is not null, the method removes the process from the JobProcessContext.
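The flow described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the Kylin source: the class and method names are hypothetical, the script path and arguments are assumed to be joined with single spaces, and logging, job tracking, and interrupt handling are left out. In main, echo stands in for diag.sh so the sketch has no side effects.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical, simplified sketch of the described flow; not Kylin code.
class NativeCommandDemo {
    // Assumption: script path and arguments are joined with single spaces.
    static String buildDiagCommand(String scriptPath, String... args) {
        return scriptPath + " " + String.join(" ", args);
    }

    // The whole command is handed to a shell as ONE string, so the shell
    // interprets metacharacters such as backticks inside it.
    static String[] buildShellCommand(String command, String osName) {
        if (osName.startsWith("Windows")) {
            return new String[] { "cmd.exe", "/C", command };
        }
        return new String[] { "/bin/bash", "-c", command };
    }

    // Runs the command and returns "<exit code>:<combined output>".
    static String run(String command) throws IOException, InterruptedException {
        ProcessBuilder builder =
                new ProcessBuilder(buildShellCommand(command, System.getProperty("os.name")));
        builder.redirectErrorStream(true); // merge stderr into stdout
        Process process = builder.start();
        StringBuilder output = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append('\n');
            }
        }
        return process.waitFor() + ":" + output;
    }

    public static void main(String[] args) throws Exception {
        // "echo" stands in for diag.sh; the first argument plays the decoded project name.
        String diagCmd = buildDiagCommand("echo", "`echo injected`", "/tmp/export");
        System.out.println(diagCmd); // echo `echo injected` /tmp/export
        // On a Linux host with bash, the inner command runs before the outer one.
        System.out.print(run(diagCmd));
    }
}
```

The point of the sketch is the single-string hand-off to the shell: because the project name is concatenated into the command text instead of being passed as a separate argument to the process, the backticks are evaluated by bash.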





- runNativeCommand is done, and its result is back in the execute method:

r = runNativeCommand(command, logAppender, jobId);
Now it’s a matter of sending the command output back in the response.


What the execution looks like with an injected malicious payload
Since we went through in depth how everything gets processed in the previous section, I will now just show screenshots of what it looks like with an injected malicious payload. Follow the same steps as in the “Reproduce the vulnerability” section, but instead of sending the request through Burp Suite, send it from the browser so you can follow it in the debugger:

The root cause
I understood the root cause after the patch diffing. As the patch diff shows, they replaced “project” with “projectName”. When you follow the debugger, you will notice that “project” is just the project name as submitted in the request (which is controlled by the user) after decoding. So when the attacker submits the malicious payload, the solution decodes it and passes it along as is. projectName, on the other hand, is the real stored name with no special characters or spaces, and it is retrieved through:
ProjectManager.getInstance(KylinConfig.getInstanceFromEnv())
You will notice the projectName variable value in the debugger.
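The idea behind the fix can be sketched as follows: never forward the request value to the command, only a name that was validated and stored at creation time. Everything in this sketch is hypothetical. The in-memory map stands in for the project registry that the real code reaches through ProjectManager, and the sanitize step is my assumption about how a request value can be mapped back to a stored name; check the linked commit for what the patch actually does.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the fix idea; not the actual patch.
class ProjectLookupDemo {
    // Stand-in for the project registry; keys and values are stored names.
    private final Map<String, String> projects = new HashMap<>();

    // Names are validated once, at creation time.
    void createProject(String name) {
        if (!name.matches("[a-zA-Z0-9_]+")) {
            throw new IllegalArgumentException("invalid project name");
        }
        projects.put(name, name);
    }

    // Assumption: drop every character outside letters, digits, and underscores.
    static String sanitize(String requested) {
        return requested.replaceAll("[^a-zA-Z0-9_]", "");
    }

    // Return the stored name, never the raw request value.
    String resolveStoredName(String requested) {
        String stored = projects.get(sanitize(requested));
        if (stored == null) {
            throw new IllegalArgumentException("unknown project");
        }
        return stored;
    }

    public static void main(String[] args) {
        ProjectLookupDemo registry = new ProjectLookupDemo();
        registry.createProject("touchpawned");
        // The decoded payload resolves to the harmless stored name.
        System.out.println(registry.resolveStoredName("`touch pawned`")); // touchpawned
    }
}
```

With this approach, the string that ends up in the shell command can only contain characters that passed the creation-time check.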


Patch Diffing
The fix is available here: https://github.com/apache/kylin/commit/f4daf14dde99b934c92ce2c832509f24342bc845#diff-5ca0e5634941e5810bc535c8084b3f11f9dce8cbb513500ec22db6a3a69ec930L97
Mitigation
Update Apache Kylin to the latest version.
Final Thoughts
This software was a real joy. The dependency between multiple solutions makes it a little harder to debug, but I tried my best to keep the focus on Apache Kylin only. How the payload has to be structured in order to be injected is really interesting and fun.
Resources:
- https://securitylab.github.com/advisories/GHSL-2021-1048_GHSL-2021-1051_Apache_Kylin/
- https://github.com/apache/kylin/commit/f4daf14dde99b934c92ce2c832509f24342bc845#diff-5ca0e5634941e5810bc535c8084b3f11f9dce8cbb513500ec22db6a3a69ec930L97
- https://kylin.apache.org/docs/install/index.html
- https://github.com/apache/kylin