wip: new(build,userspace): switch to use container plugin #3482

Open
wants to merge 9 commits into
base: master

Conversation

FedeDP
Contributor

@FedeDP FedeDP commented Feb 4, 2025

What type of PR is this?

/kind cleanup
/kind feature

Any specific area of the project related to this PR?

/area build
/area engine

What this PR does / why we need it:

This PR bumps Falco to HEAD of falcosecurity/libs#2207, enforcing the usage of the container plugin.
Keeping it wip until all issues are resolved.

TODO:

  • fix CI issues -> test_falco_engine.extra_format_do_not_replace_container_info
  • the config now loads the container plugin by default. However, the container plugin itself is not supported on Windows and macOS (nor on musl, of course). Therefore we either use a new falco_others.yaml config (installed on Windows, macOS and musl in place of falco.yaml), or we use something like load_plugins: [@FALCO_DEFAULT_PLUGINS@] and customize FALCO_DEFAULT_PLUGINS from within cmake -> we decided to add the plugin configuration to the main falco.yaml file and then install an override config file that loads the plugin only on non-musl Linux installations
  • what to do with musl? We won't have any way to load the plugin at runtime, so the musl build won't support containers anymore -> not a big deal for now; eventually, if user demand is high, we might want to statically link the plugin somehow
  • fix up the Falco yaml container_engines: key: drop it in favor of the plugin config -> implement the TODOs or drop it? Unfortunately it is not at Sandbox level but Incubating, thus theoretically we'd need a deprecation period -> before 1.0.0, Incubating features can be dropped without a deprecation period: https://github.com/falcosecurity/falco/blob/master/proposals/20231220-features-adoption-and-deprecation.md#before-falco-10
  • Update rules? They will need to require container plugin

While it should be up to falcoctl to install the container plugin, container support was previously a core feature of the libraries, so we decided to keep the plugin installed by the Falco package. This way, host installations through packages won't lose the feature by default (i.e. without having to use falcoctl to install the plugin).
Another solution could be to let the rpm/deb install scripts leverage falcoctl to install the plugin, BUT the tar.gz tarball would still be hit by the feature loss.

Which issue(s) this PR fixes:

Refs #3403

Special notes for your reviewer:

The cares, grpc, openssl and curl cmake files were copy-pasted from the libs repo.

Does this PR introduce a user-facing change?:

new(build,userspace): switch to use container plugin

@poiana
Contributor

poiana commented Feb 4, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: FedeDP

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@FedeDP
Contributor Author

FedeDP commented Feb 10, 2025

/milestone 0.41.0

@poiana poiana added this to the 0.41.0 milestone Feb 10, 2025

This PR may bring feature or behavior changes in the Falco engine and may require the engine version to be bumped.

Please double check userspace/engine/falco_engine_version.h file. See versioning for FALCO_ENGINE_VERSION.

/hold

@FedeDP
Contributor Author

FedeDP commented Feb 13, 2025

Going to remove the wip commit once falcosecurity/testing#69 gets merged.

EDIT: removed

@FedeDP FedeDP force-pushed the new/container_plugin branch from 624544b to 03e2350 Compare February 13, 2025 15:28
@FedeDP
Contributor Author

FedeDP commented Feb 13, 2025

Now the arm64 test-dev-packages is down to 4 errors. 2 of them are related to the usage of the full branch name for the libs and driver deps; they will be fixed once we properly use a commit hash.

The remaining 2 are actually linked to the container plugin: it seems that, when replaying a capture file, we see 2 container events instead of 3:

  TestFalco_Legacy_ContainerPrivileged

{"deadline":180000000000,"level":"info","msg":"running falco with runner","time":"2025-02-13T16:41:57Z"}
{"cmd":"/usr/bin/falco -c /etc/falco/falco.yaml -o append_output.suggested_output=false -o json_output=true -r falco_rules.yaml -o engine.kind=replay -o engine.replay.capture_file=container-privileged.scap -A -o json_include_output_property=false -o json_include_tags_property=false -o log_level=debug -o log_stderr=true -o log_syslog=false -o stdout_output.enabled=true","level":"debug","msg":"executing command","time":"2025-02-13T16:41:57Z"}
    legacy_test.go:2498: 
        	Error Trace:	/home/runner/work/_actions/falcosecurity/testing/main/legacy_test.go:2498
        	Error:      	Not equal: 
        	            	expected: 3
        	            	actual  : 2
        	Test:       	TestFalco_Legacy_ContainerPrivileged

  TestFalco_Legacy_ContainerSensitiveMount

{"deadline":180000000000,"level":"info","msg":"running falco with runner","time":"2025-02-13T16:41:56Z"}
{"cmd":"/usr/bin/falco -c /etc/falco/falco.yaml -o append_output.suggested_output=false -o json_output=true -r falco_rules.yaml -o engine.kind=replay -o engine.replay.capture_file=container-sensitive-mount.scap -A -o json_include_output_property=false -o json_include_tags_property=false -o log_level=debug -o log_stderr=true -o log_syslog=false -o stdout_output.enabled=true","level":"debug","msg":"executing command","time":"2025-02-13T16:41:56Z"}
    legacy_test.go:2517: 
        	Error Trace:	/home/runner/work/_actions/falcosecurity/testing/main/legacy_test.go:2517
        	Error:      	Not equal: 
        	            	expected: 3
        	            	actual  : 2
        	Test:       	TestFalco_Legacy_ContainerSensitiveMount

EDIT: on x86 we also have a couple of ASSERTs triggering in libs code:

falco: /home/runner/work/falco/falco/build/falcosecurity-libs-repo/falcosecurity-libs-prefix/src/falcosecurity-libs/userspace/libsinsp/threadinfo.cpp:782: sinsp_fdinfo* sinsp_threadinfo::add_fd(int64_t, std::unique_ptr<sinsp_fdinfo>): Assertion `false' failed.

@FedeDP
Contributor Author

FedeDP commented Feb 14, 2025

The failing tests come from the fact that we are not able to extract container fields from the container events. Will look into it.

Indeed, we were previously setting tinfo on the generated container json event: https://github.com/falcosecurity/libs/blob/master/userspace/libsinsp/container.cpp#L287; I am trying to understand whether this is reproducible with a plugin ASYNCEVENT.

EDIT: fixed; now only the x86 ASSERT failures remain:

falco: /home/runner/work/falco/falco/build/falcosecurity-libs-repo/falcosecurity-libs-prefix/src/falcosecurity-libs/userspace/libsinsp/threadinfo.cpp:782: sinsp_fdinfo* sinsp_threadinfo::add_fd(int64_t, std::unique_ptr<sinsp_fdinfo>): Assertion `false' failed.

@FedeDP
Contributor Author

FedeDP commented Feb 17, 2025

Why are only bpf and kmod hit?

It seems modern ebpf is hit too:

grep podman modern.txt 
NULL fd_table_ptr: 'podman' 'podman --log-level=info system service'

That output was obtained after adding the following check:

sinsp_fdinfo* sinsp_threadinfo::add_fd(int64_t fd, std::unique_ptr<sinsp_fdinfo> fdinfo) {
	sinsp_fdtable* fd_table_ptr = get_fd_table();
	if(fd_table_ptr == NULL) {
		// Debug print: log the thread whose fd table is missing.
		printf("NULL fd_table_ptr: '%s' '%s'\n", m_comm.c_str(), m_cmd_line.c_str());
		return NULL;
	}
	// ... rest of the function unchanged
}

Why does the very same thing not happen using libs master?

Because the culprit is the podman service started by the podman SDK within the go-worker of the plugin. Thus no plugin, no issue.

I added a commit on top of my libs PR to drop the ASSERT(). Note that get_fd_table is used multiple times inside threadinfo.cpp, and the most recent methods using it already omit the ASSERT. That's because get_fd_table can return null, given its implementation:

inline sinsp_fdtable* get_fd_table() {
	if(!(m_flags & PPM_CL_CLONE_FILES)) {
		return &m_fdtable;
	} else {
		sinsp_threadinfo* root = get_main_thread();
		return (root == nullptr) ? nullptr : &(root->get_fdtable());
	}
}

Probably there was a time when we always returned &m_fdtable (i.e. when we never enforced the PPM_CL_CLONE_FILES flag). Nowadays we always enforce that flag, as stated in the comment above the function:

/* Note that fd_table should be shared with the main thread only if PPM_CL_CLONE_FILES
* is specified. Today we always specify PPM_CL_CLONE_FILES for all threads.
*/
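
The null case described above can be reproduced in isolation. The following is a minimal sketch, not the libsinsp code: fdinfo, fdtable, threadinfo and the shares_files / main_thread members are hypothetical stand-ins that only mirror the shape of get_fd_table() and add_fd() quoted in this comment. It assumes the fix is simply to return a null pointer to the caller instead of asserting.

```cpp
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <utility>

// Hypothetical stand-ins for the libsinsp types; only what the sketch needs.
struct fdinfo {
    int64_t fd = -1;
};

struct fdtable {
    std::unordered_map<int64_t, std::unique_ptr<fdinfo>> entries;

    // Stores the entry under the given fd and returns a non-owning pointer to it.
    fdinfo* add(int64_t fd, std::unique_ptr<fdinfo> info) {
        info->fd = fd;
        auto& slot = entries[fd];
        slot = std::move(info);
        return slot.get();
    }
};

struct threadinfo {
    bool shares_files = false;          // models the PPM_CL_CLONE_FILES flag
    threadinfo* main_thread = nullptr;  // models a main thread that may be missing
    fdtable own_table;

    // Same shape as get_fd_table(): null when the table is shared with the
    // main thread but the main thread cannot be found.
    fdtable* get_fd_table() {
        if(!shares_files) {
            return &own_table;
        }
        return (main_thread == nullptr) ? nullptr : &main_thread->own_table;
    }

    // Null-safe variant: no assertion, the caller checks the returned pointer.
    fdinfo* add_fd(int64_t fd, std::unique_ptr<fdinfo> info) {
        fdtable* table = get_fd_table();
        if(table == nullptr) {
            return nullptr;
        }
        return table->add(fd, std::move(info));
    }
};
```

With these stand-ins, a thread that shares its files but has no reachable main thread gets a null pointer back from add_fd, while a thread whose main thread is known stores the entry in the main thread's table.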

It will be shipped by default, hence it is present in the default config.

Signed-off-by: Federico Di Pierro <[email protected]>
Also, the default falco.yaml will only host the container plugin configuration but won't enable the plugin.
Instead, a configuration override file that enables the plugin will be installed only on non-musl Linux deployments.

Signed-off-by: Federico Di Pierro <[email protected]>
… string.

Also, remove `container_id container_name` fields from `-pc` output.
These fields are now automatically appended since the `container` plugin
marks them as suggested.

Signed-off-by: Federico Di Pierro <[email protected]>
@FedeDP
Contributor Author

FedeDP commented Feb 17, 2025

1 failure left on x86_64:

  • TestFalco_Legacy_KernelUpgrade

No idea why this is failing.

2 failures on arm64:

  • TestFalcoLegacyBPF
  • TestFalcoKmod

Both of them are present in https://github.com/falcosecurity/falco/actions/runs/13330903373 too (i.e. they do not seem to depend on the container work).

Many failures on x86_64 musl; that's because we cannot load the plugin on musl, as already stated in the PR body. I think we will just need to remove the required check on it (or just stop testing the musl build).

@FedeDP
Contributor Author

FedeDP commented Feb 18, 2025

falcosecurity/testing#71 fixes arm64 kmod and legacyBpf failures.
Still investigating x86_64 TestFalco_Legacy_KernelUpgrade.

@FedeDP FedeDP force-pushed the new/container_plugin branch from 5d0b262 to bf07203 Compare February 18, 2025 08:58
@FedeDP
Contributor Author

FedeDP commented Feb 18, 2025

/reopen

@poiana
Contributor

poiana commented Feb 18, 2025

@FedeDP: Failed to re-open PR: state cannot be changed. The new/container_plugin branch was force-pushed or recreated.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@FedeDP
Contributor Author

FedeDP commented Feb 18, 2025

/reopen

@poiana poiana reopened this Feb 18, 2025
@poiana
Contributor

poiana commented Feb 18, 2025

@FedeDP: Reopened this PR.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@FedeDP FedeDP force-pushed the new/container_plugin branch 2 times, most recently from dec8857 to c127177 Compare February 18, 2025 10:50
The container plugin cannot be dynamically loaded on the musl build, therefore
some falcosecurity/testing tests are failing on it.

Signed-off-by: Federico Di Pierro <[email protected]>
Projects
Status: In progress

2 participants