Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Whether the codeql result contains function call chain information. #18796

Open
ysuLihua opened this issue Feb 17, 2025 · 3 comments
Open

Whether the codeql result contains function call chain information. #18796

ysuLihua opened this issue Feb 17, 2025 · 3 comments
Labels
question Further information is requested

Comments

@ysuLihua
Copy link

ysuLihua commented Feb 17, 2025

test code:

[root@*** test_case]# cat f_open_test_0.cpp
#include <cstdio>
#include<stdlib.h>
using namespace std;

const char * do_getenv1(const char * name) {
    return getenv(name);
}
[root@*** test_case]# cat f_open_test_a.cpp
#include "f_open_test_0.cpp"
extern const char * do_getenv1(const char *);

const char * do_getenv() {
    return do_getenv1("FILENAME1");
}


[root@*** test_case]# cat f_open_test_b.cpp 
#include "f_open_test_a.cpp"
extern const char * do_getenv();

int main(){
    const char * filename1 = do_getenv();
    FILE * file1 = fopen(filename1, "r");
}

running rule:
fopen-flow-from-getenv.ql

but results file don't contain the function call chain, results like this:

		"results": [{
			"ruleId": "cpp/huawei/member-variable-to-resource-leak",
			"ruleIndex": 0,
			"rule": {
				"id": "cpp/huawei/member-variable-to-resource-leak",
				"index": 0
			},
			"message": {
				"text": "This 'fopen' uses data from [call to 'getenv'](1)."
			},
			"locations": [{
				"physicalLocation": {
					"artifactLocation": {
						"uri": "f_open_test_b.cpp",
						"uriBaseId": "%SRCROOT%",
						"index": 0
					},
					"region": {
						"startLine": 6,
						"startColumn": 26,
						"endColumn": 35
					},
					"contextRegion": {
						"startLine": 4,
						"endLine": 8,
						"snippet": {
							"text": "int main(){\n    const char * filename1 = do_getenv();\n    FILE * file1 = fopen(filename1, \"r\");\n}\n\n"
						}
					}
				}
			}],
			"partialFingerprints": {
				"primaryLocationLineHash": "6f35e497d744c2f6:1",
				"primaryLocationStartColumnFingerprint": "21"
			},
			"relatedLocations": [{
				"id": 1,
				"physicalLocation": {
					"artifactLocation": {
						"uri": "f_open_test_0.cpp",
						"uriBaseId": "%SRCROOT%",
						"index": 1
					},
					"region": {
						"startLine": 6,
						"startColumn": 12,
						"endColumn": 18
					},
					"contextRegion": {
						"startLine": 4,
						"endLine": 7,
						"snippet": {
							"text": "\nconst char * do_getenv1(const char * name) {\n    return getenv(name);\n}\n"
						}
					}
				},
				"message": {
					"text": "call to 'getenv'"
				}
			}]
		}],
		"columnKind": "utf16CodeUnits",
		"properties": {
			"semmle.formatSpecifier": "sarif-latest"
		}
	}]
@ysuLihua ysuLihua added the question Further information is requested label Feb 17, 2025
@smowton
Copy link
Contributor

smowton commented Feb 17, 2025

Is the query of @kind problem? If so perhaps it should be converted to @kind path-problem. Then when exported to SARIF you should see a codeFlows element describing the path.

https://codeql.github.com/docs/writing-codeql-queries/creating-path-queries/ has general information about path-problem queries, or see the many standard path-problem queries for examples of how the data-flow library produces the flow-graph information needed.

@ysuLihua
Copy link
Author

Is the query of @kind problem? If so perhaps it should be converted to @kind path-problem. Then when exported to SARIF you should see a codeFlows element describing the path.

https://codeql.github.com/docs/writing-codeql-queries/creating-path-queries/ has general information about path-problem queries, or see the many standard path-problem queries for examples of how the data-flow library produces the flow-graph information needed.

Hi @smowton , thanks! The problem is resolved, but Why do multiple same location elements exist in the result file?

			"codeFlows": [{
				"threadFlows": [{
					"locations": [{
						"location": {
							"physicalLocation": {
								"artifactLocation": {
									"uri": "f_open_test_0.cpp",
									"uriBaseId": "%SRCROOT%",
									"index": 1
								},
								"region": {
									"startLine": 6,
									"startColumn": 12,
									"endColumn": 24
								},
								"contextRegion": {
									"startLine": 4,
									"endLine": 7,
									"snippet": {
										"text": "\nconst char * do_getenv1(const char * name) {\n    return getenv(name);\n}\n"
									}
								}
							},
							"message": {
								"text": "*call to getenv"
							}
						}
					},
					{
						"location": {
							"physicalLocation": {
								"artifactLocation": {
									"uri": "f_open_test_0.cpp",
									"uriBaseId": "%SRCROOT%",
									"index": 1
								},
								"region": {
									"startLine": 6,
									"startColumn": 12,
									"endColumn": 24
								},
								"contextRegion": {
									"startLine": 4,
									"endLine": 7,
									"snippet": {
										"text": "\nconst char * do_getenv1(const char * name) {\n    return getenv(name);\n}\n"
									}
								}
							},
							"message": {
								"text": "*call to getenv"
							}
						}
					},
					{
						"location": {
							"physicalLocation": {
								"artifactLocation": {
									"uri": "f_open_test_0.cpp",
									"uriBaseId": "%SRCROOT%",
									"index": 1
								},
								"region": {
									"startLine": 5,
									"startColumn": 14,
									"endColumn": 24
								},
								"contextRegion": {
									"startLine": 3,
									"endLine": 7,
									"snippet": {
										"text": "using namespace std;\n\nconst char * do_getenv1(const char * name) {\n    return getenv(name);\n}\n"
									}
								}
							},
							"message": {
								"text": "**do_getenv1"
							}
						}
					},
					{
						"location": {
							"physicalLocation": {
								"artifactLocation": {
									"uri": "f_open_test_a.cpp",
									"uriBaseId": "%SRCROOT%",
									"index": 2
								},
								"region": {
									"startLine": 5,
									"startColumn": 12,
									"endColumn": 22
								},
								"contextRegion": {
									"startLine": 3,
									"endLine": 7,
									"snippet": {
										"text": "\nconst char * do_getenv() {\n    return do_getenv1(\"FILENAME1\");\n}\n\n"
									}
								}
							},
							"message": {
								"text": "*call to do_getenv1"
							}
						}
					},
					{
						"location": {
							"physicalLocation": {
								"artifactLocation": {
									"uri": "f_open_test_a.cpp",
									"uriBaseId": "%SRCROOT%",
									"index": 2
								},
								"region": {
									"startLine": 5,
									"startColumn": 12,
									"endColumn": 22
								},
								"contextRegion": {
									"startLine": 3,
									"endLine": 7,
									"snippet": {
										"text": "\nconst char * do_getenv() {\n    return do_getenv1(\"FILENAME1\");\n}\n\n"
									}
								}
							},
							"message": {
								"text": "*call to do_getenv1"
							}
						}
					},
					{
						"location": {
							"physicalLocation": {
								"artifactLocation": {
									"uri": "f_open_test_a.cpp",
									"uriBaseId": "%SRCROOT%",
									"index": 2
								},
								"region": {
									"startLine": 4,
									"startColumn": 14,
									"endColumn": 23
								},
								"contextRegion": {
									"startLine": 2,
									"endLine": 6,
									"snippet": {
										"text": "extern const char * do_getenv1(const char *);\n\nconst char * do_getenv() {\n    return do_getenv1(\"FILENAME1\");\n}\n"
									}
								}
							},
							"message": {
								"text": "**do_getenv"
							}
						}
					},
					{
						"location": {
							"physicalLocation": {
								"artifactLocation": {
									"uri": "f_open_test_b.cpp",
									"uriBaseId": "%SRCROOT%",
									"index": 0
								},
								"region": {
									"startLine": 5,
									"startColumn": 30,
									"endColumn": 39
								},
								"contextRegion": {
									"startLine": 3,
									"endLine": 7,
									"snippet": {
										"text": "\nint main(){\n    const char * filename1 = do_getenv();\n    FILE * file1 = fopen(filename1, \"r\");\n}\n"
									}
								}
							},
							"message": {
								"text": "*call to do_getenv"
							}
						}
					},
					{
						"location": {
							"physicalLocation": {
								"artifactLocation": {
									"uri": "f_open_test_b.cpp",
									"uriBaseId": "%SRCROOT%",
									"index": 0
								},
								"region": {
									"startLine": 5,
									"startColumn": 30,
									"endColumn": 39
								},
								"contextRegion": {
									"startLine": 3,
									"endLine": 7,
									"snippet": {
										"text": "\nint main(){\n    const char * filename1 = do_getenv();\n    FILE * file1 = fopen(filename1, \"r\");\n}\n"
									}
								}
							},
							"message": {
								"text": "*call to do_getenv"
							}
						}
					},
					{
						"location": {
							"physicalLocation": {
								"artifactLocation": {
									"uri": "f_open_test_b.cpp",
									"uriBaseId": "%SRCROOT%",
									"index": 0
								},
								"region": {
									"startLine": 6,
									"startColumn": 26,
									"endColumn": 35
								},
								"contextRegion": {
									"startLine": 4,
									"endLine": 8,
									"snippet": {
										"text": "int main(){\n    const char * filename1 = do_getenv();\n    FILE * file1 = fopen(filename1, \"r\");\n}\n\n"
									}
								}
							},
							"message": {
								"text": "*filename1"
							}
						}
					}]
				}]
			}]

rule content:

/**
 * @name member variable to open or create resrouce
 * @description Use a member variable to open or create resource. maybe unrelease and detect resource leaks.
 * @kind path-problem
 * @id cpp/huawei/member-variable-to-resource-leak
 * @problem.severity warning
 * @security-severity 7.8
 * @tags efficiency
 *       security
 *       external/cwe/cwe-404
 */

import cpp
import semmle.code.cpp.dataflow.new.DataFlow

module EnvironmentToFileConfig implements DataFlow::ConfigSig {
  predicate isSource(DataFlow::Node source) {
    exists(Function getenv |
      source.asIndirectExpr(1).(FunctionCall).getTarget() = getenv and
      getenv.hasGlobalName("getenv")
    )
  }

  predicate isSink(DataFlow::Node sink) {
    exists(FunctionCall fc |
      sink.asIndirectExpr(1) = fc.getArgument(0) and
      fc.getTarget().hasGlobalName("fopen")
    )
  }
}

// module EnvironmentToFileFlow = DataFlow::Global<EnvironmentToFileConfig>;
module Flow = DataFlow::Global<EnvironmentToFileConfig>;
import Flow::PathGraph

from Expr getenv, Expr fopen, Flow::PathNode source, Flow::PathNode sink
where
  source.getNode().asIndirectExpr(1) = getenv and
  sink.getNode().asIndirectExpr(1) = fopen and
  Flow::flowPath(source, sink)
select fopen, source, sink, "open file by tainted data"

@smowton
Copy link
Contributor

smowton commented Feb 18, 2025

It's because at a technical level the data-flow graph has more than one node representing the call, which are both attributes to the same location in source. I'll ask the C/C++ team if it's possible to suppress one in the path description; however this isn't specific to this particular query, so there's nothing wrong with your CodeQL.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants