
Filtering the austin output? #245

Open
LaurensBosscher opened this issue Jan 7, 2025 · 1 comment

@LaurensBosscher

Description

First of all, thanks for this project! This is a very useful tool that absolutely fills a gap in the Python ecosystem.

As I've started using this, I've run into a few scenarios where the user experience is not optimal. For example:

I have a test for a Python function that:

  • Has a nested loop
  • Uses a Pandas call within that loop

This creates an enormous output file full of information that is not directly relevant (e.g. internal Pandas calls). The size of the output makes it harder to work with the file and to find the lines in my code that are slow to execute.

Desired scenario

I would like to increase the signal-to-noise ratio in the output file and end up with a smaller file that's easier (and faster) to load. Possible solutions could be:

  1. Ignoring C calls
  2. Specifying a max stack depth
  3. Excluding certain directories (e.g. .venv)
  4. Including only specific directories (e.g. project/src)
  5. Filtering the output file using sed
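Options 2–4 could be sketched as a small post-processing filter. This is a hypothetical illustration, not part of Austin: it assumes the collapsed text output, where each sample line is a semicolon-separated stack of frames followed by a whitespace-separated metric, and simply drops frames by substring match and caps the stack depth.

```python
def filter_stack(line, max_depth=None, exclude=()):
    """Rewrite one collapsed-stack line: drop frames containing any
    `exclude` substring (e.g. ".venv") and cap the stack depth.

    Assumes a line shaped like "frame;frame;...;frame <metric>".
    """
    stack, sep, metric = line.rstrip("\n").rpartition(" ")
    if not sep:                      # not a sample line: pass through
        return line.rstrip("\n")
    frames = [f for f in stack.split(";")
              if not any(pat in f for pat in exclude)]
    if max_depth is not None:
        frames = frames[:max_depth]
    return ";".join(frames) + " " + metric
```

Piping Austin's output through a script built around this function (e.g. `austin python script.py | python filter.py > filtered.austin`, with `filter.py` a hypothetical wrapper that applies it line by line from stdin) would discard the unwanted frames before the file ever hits disk.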
@P403n1x87
Owner

@LaurensBosscher thanks for your interest in Austin. I'm glad you find it useful. The main reason the Austin interface is kept small is that I wanted all available resources to go into sampling frame stacks, so any processing of the collected samples is pushed out of Austin and into libraries and tools like austin-python. From what I understand, the solution probably won't be as optimal as you'd like: you would still end up with a large file from Austin (although if you collect the data in binary format you might get 4x–6x compression on average), and you would still have to craft a custom tool to do what you need, with the hope that the library makes it slightly easier to handle the output file.
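A minimal sketch of such a custom tool, written directly against the collapsed text output rather than austin-python's API (whose exact interface isn't shown here): it assumes each frame looks like `file:function:line` and each line ends with an integer metric, attributes each sample to its deepest in-project frame, and sums the totals to surface the slow lines. The `project/src` prefix is a hypothetical layout.

```python
from collections import Counter

def hot_spots(lines, project_prefix="project/src"):
    """Sum each sample's metric onto its deepest frame under
    `project_prefix`, returning (frame, total) pairs, hottest first."""
    totals = Counter()
    for line in lines:
        stack, sep, metric = line.strip().rpartition(" ")
        if not sep or not metric.isdigit():
            continue                 # skip non-sample lines
        # walk from the leaf upwards to the first in-project frame
        for frame in reversed(stack.split(";")):
            if frame.startswith(project_prefix):
                totals[frame] += int(metric)
                break
    return totals.most_common()
```

Run over the whole output file (`hot_spots(open("out.austin"))`), this collapses even a huge sample dump into a short ranked list of your own source lines, sidestepping the file-size problem for this particular question.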
