Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exception Message truncation isn't so friendly #2185

Closed
deleugpn opened this issue Oct 31, 2018 · 33 comments
Closed

Exception Message truncation isn't so friendly #2185

deleugpn opened this issue Oct 31, 2018 · 33 comments

Comments

@deleugpn
Copy link
Contributor

Hi everyone,

I know this has been extensively discussed here #1722 but unfortunately the exception truncation is a constant pain-point for me to the point that I'm almost maintaining a fork of Guzzle just to change that number. I understand a few of the points made in the thread such as the uncertainty of the size of the content in the response. A counter-argument that I would like to raise is the fact that we are forced into treating Guzzle Exceptions differently in our systems because Guzzle won't let me see enough of the response content. And then I have to turn this

$client->get($resource);

into this

try {
    $client->get($resource);
} catch(BadResponseException $e) {
    $e->getResponse()->getBody()->getContents();
}

throughout my entire platform. Sometimes we include packages that make usage of Guzzle and the package won't let me see the full error message as well.

This seem such a small issue that brings a lot of headache. an extremely simple static Client::MESSAGE_SIZE = 6400; would suffice for me and probably for most of the people suffering from this. The pain-point for me is located on 5xx requests and 4xx requests between internal microservices. Those are not production-ready code that I'm working on and trying to stabilize them in a way that I'll never have a 5xx or 4xx, but I don't know what to fix because I can't see the error message.

Something else described on the linked thread was this:

Obviously you don't want to log that 10k JSON, so in this case you should extract the necessary/useful information from the body, and properly log it instead of the exception message.

Why Obviously I don't want to log that 10k JSON? What if I do? Even if I don't need the 10k, 120 characters is still not enough for me to see which table name does not exist from a standard MySQL error.

Anyway, I would really hope that this could be revisited. I would like to finish by saying that what Guzzle is doing right now is forcing me to catch an exception not because I should catch it, but because I cannot see the exception's message if I don't. I would like to be able to only catch exceptions that I really need to catch and let all other exceptions be reported.

Thanks for the this amazing tool.

@gmponos
Copy link
Member

gmponos commented Oct 31, 2018

  1. Although I am not a member of this repository I can't help to notice that I believe that a new issue was not required since you found the old one. You could just comment there.
  2. Personally I dislike the truncate that happens there but from a different point of view. I believe that a title on the exception should have been enough.

https://github.com/guzzle/guzzle/blob/master/src/Exception/RequestException.php#L105

        $message = sprintf(
            '%s: `%s %s` resulted in a `%s %s` response',
            $label,
            $request->getMethod(),
            $uri,
            $response->getStatusCode(),
            $response->getReasonPhrase()
        );

Only the above should have been enough without the summary. If the developers wanted more info they can push a log middleware to the stack of guzzle and get the info they require.

@gmponos
Copy link
Member

gmponos commented Oct 31, 2018

In the same way (without summary) is also applied to httplug https://github.com/php-http/httplug/blob/v2.0.0/src/Exception/HttpException.php#L53

@sagikazarmark
Copy link
Member

Why Obviously I don't want to log that 10k JSON?

I know it's already 2018, but people still make the mistake of storing logs on servers with limited disks without rotation. If I have to modify that statement, I would say logging the body of HTTP responses in an uncontrolled manner poses a very real and horrifying risk. From a maintainer point of view I have to be on the defensive side of things on this one.

A configuration option could be a solution, but the thing is it does not protect you from the above thing if you just globally turn that option into infinity.

That being said, we might consider adding an option.

Till then, here are a few (better) options:

  1. I strongly believe that the proper way to put the body into the exception is catching it and manually returning another exception. That's just simply superior to anything, because you consciously decide how to handle errors.
  2. Guzzle exposes an interface that it implements. Unfortunately it's not used widely, but implementing that custom interface and doing whatever you want with the exception and then rethrowing it can be a solution.
  3. Directly depending on Guzzle is an architecture mistake anyway, so you should implement a wrapper interface of your own (or use something like HTTPlug/PSR-18). Again, inside that interface, you do whatever you want.
  4. You can create a middleware, catch the interface and then rehtrow it in a single point.

Ultimately I believe this issue only occurs because people don't use Guzzle in a way they should. The above solutions should all provide a solution. As mentioned before, security MUST be a first priority in this case. That said, I will leave this issue open and consider adding an option, but this isn't a high priority issue for now. I hope it helps.

@deleugpn
Copy link
Contributor Author

For me, 1) is not a solution. It's essentially the problem. I already have a exception handler library attached to a logging library that will log any exception for me. The library won't treat guzzle exceptions differently and I don't want to fork the library and do that.
I also don't want to catch guzzle exceptions because I believe I should catch it if I want to handle it. I should not be forced to catch an exception just to see the message.

  1. is not a solution for the same reason as 1. Every exception class has a method called getMessage and an http exception message is in the body. The interface could have been a solution if Guzzle offered an inversion of control container. I would be able to implement my custom exception class and then tell Guzzle to use my class instead of the default one.

3 is unnecessary overhead. Yes, I would be able to have the try catch and rethrow in only one place by doing 3 and I might try to do that. Unfortunately I see it as a tremendous overhead because it brings 0 benefit for my needs if Guzzle could just let me see the exception message in my logging system.

I haven't pursued 4) because from the linked issue (#1722) I understood you were against doing exactly that. I will try to learn more about Guzzle Middleware to try and accomplish that.

For your last statement, if you think people should always catch Guzzle exception if they want to see the message, then I have to agree with you that people don't use Guzzle correctly. Honestly, I don't want to use Guzzle like that. I understand and respect the decision for security first, but dictating that getting the response body is unsafe in any situation is just incorrect.
I know a lot of people can misuse the ability to configure the size and end up with a flaw, but that's an explicit developer decision and Guzzle should not be concerned about that.
I would even argue that you could still limit the configuration saying nobody can instruct Guzzle to decode more than 512KB worth of data. But 120 characters isn't enough for most MySQL error messages, which are the most basic error messages there is.

@trevorgehman
Copy link

trevorgehman commented Feb 14, 2019

Just wanted to chime in here since this issue is still open. Ran into this problem myself recently when dealing with the MailChimp API, which returns errors like this:

{"type":"http://developer.mailchimp.com/documentation/mailchimp/guides/error-glossary/","title":"Invalid Resource","status":400,"detail":"The \"name\" parameter must be unique but a segment named \"XXXXXXX\" already exists.","instance":"0fac1777-b697-4f65-89e9-xxxxxxxxxx"}

With this truncating, I only get this:

Client error: 'PATCH https://us18.api.mailchimp.com/3.0/lists/xxxxxxxxx/segments/xxxxx' resulted in a '400 Bad Request' response{"type":"http://developer.mailchimp.com/documentation/mailchimp/guides/error-glossary/","title":"Invalid Resourc ... (truncated)

I'm using Laravel and Bugsnag, so error handling and reporting is pretty much out of the box. I tried catching the Guzzle RequestException in the handler, but then when I throw a new exception, I lose the trace.

So my only option at this point is to wrap all the Guzzle requests throughout my app, or at least in the places I anticipate needing a longer message body. It seems like a lot of work, when a simple increase in the truncated limit from 120 to 240 would be sufficient.

@deleugpn
Copy link
Contributor Author

You get it!

@realtebo
Copy link

Is there any workaround ?
I read about 'middleware' for Guzzle. Could someone write down an example ?

@landsman
Copy link

landsman commented Sep 13, 2019

Exactly! This is really annoying.

Hardware and storage do not cost nothing today, but I can undestand that need for somebody. So let's do that by configuration for easier using by other people. For example env variable loading would be great.

We have to do this 🤦‍♂️

<?php declare(strict_types=1);

namespace Trisbee\Shoptet\Trisbee;

use GuzzleHttp\ClientInterface;
use GuzzleHttp\Exception\RequestException;
use Trisbee\Shoptet\Trisbee\Exception\ApiException;

class HttpClient
{


    /**
     * @var ClientInterface
     */
    private $client;

    public function __construct(ClientInterface $client)
    {
        $this->client = $client;
    }

    public function post($uri, $options = [])
    {
        return $this->sendRequest(function ($uri, $options) {
            return $this->client->post($uri, $options);
        }, $uri, $options);
    }

    public function get($uri, $options = [])
    {
        return $this->sendRequest(function ($uri, $options) {
            return $this->client->get($uri, $options);
        }, $uri, $options);
    }

    public function patch($uri, $options = [])
    {
        return $this->sendRequest(function ($uri, $options) {
            return $this->client->patch($uri, $options);
        }, $uri, $options);
    }

    public function put($uri, $options = [])
    {
        return $this->sendRequest(function ($uri, $options) {
            return $this->client->put($uri, $options);
        }, $uri, $options);
    }

    public function delete($uri, $options = [])
    {
        return $this->sendRequest(function ($uri, $options) {
            return $this->client->delete($uri, $options);
        }, $uri, $options);
    }

    private function sendRequest(callable $callFunction, $uri, $options)
    {
        try {
            return $callFunction($uri, $options);
        } catch (RequestException $e) {
            if ($e->getResponse() !== null) {
                throw new ApiException('Api error: ' . $e->getResponse()->getBody()->getContents(), $e->getCode(), $e);
            }

            throw $e;
        }
    }
}

@adrianrudnik
Copy link

We use Guzzle for functional tests and a truncated message is as useful as a simple "Client reported 400 Bad request". Wrapping the HTTP client again just to get a clear error message in tests is just one more part that can break.

I'm a bit surprised because I fell for this again. This is such an unintuitive behavior, either be generic with the exception message and wrap it up or give me everything in the message.

Is there any reason why the PHP setting log_errors_max_len is not enough?

@sagikazarmark
Copy link
Member

sagikazarmark commented Nov 28, 2019

I explained a solution I would accept in the form of a PR here: #2272 (comment)

So far none of the proposed solutions passed the bar for a secure enough alternative, so no PRs were accepted. If someone is willing to implement the above proposed/similar solution I would most certainly accept and merge it.

Is there any reason why the PHP setting log_errors_max_len is not enough?

Various logging solutions might not respect this setting.

@adrianrudnik
Copy link

@sagikazarmark The PR looks OK form certain perspective, but I'm not at that point sadly. Regarding the various logging solutions you are right, but its their space of responsibility, so they should do truncation. Why would Guzzle put itself in the responsibility of logging? It's neither what is was build for nor specializes in.

@sagikazarmark
Copy link
Member

Unfortunately exceptions end up in all sorts of places and there is no guarantee that a logger will truncate the output.

This topic has been debated several times, please read the linked issues and PRs.

Basically, HTTP message body being part of the exception message is bad to begin with. HTTP exceptions shouldn't even exist. Users should handle errors, but people are lazy.

Also, there is no guarantee that an HTTP body returns a human readable message. It's 2019: we have protobuf and thrift.

There are tons of other reasons, why dumping the HTTP message body into the exception message is a terrible idea, but this is what the community wants, so it's there. That said, it's a potential security issue, so truncation was introduced.

@deleugpn
Copy link
Contributor Author

Yeah, doesn't matter how bad it is the exception message now, your best bet is to use another http client.

@adrianrudnik
Copy link

adrianrudnik commented Nov 28, 2019

@sagikazarmark yeah, I agree with that, but as I said before: Outputting nothing from the response would be a more streamlined and understandable approach as it gives the impression that something truncated it and the last thing one would expect that it's the http lib that you cant configure. Just force users to catch the exception (or handle the error manually). Instead this lib almost the single hit for "php truncated exception" searches on google. It's just confusing and thats the single thing why so many issues arise.

Having a preview of the first x characters does not help 99% of the time and just raises issues here, is not protocol safe anyway and if security is at risk, do not even attempt to solve it for the user by truncating stuff.

I'm pro the idea to just give an exception without any response details, server response code should be enough and force the programmer to always get the response content, instead of searching of ways how this lib could be improved doing stuff its not responsible for.

@sagikazarmark
Copy link
Member

sagikazarmark commented Nov 28, 2019

@adrianrudnik Here is (one of) the solution(s) I proposed earlier:

Given this is only a problem in the http_error middleware, I would prefer this to be an option in that middleware, probably with a RequestException::createWithResponseSize named constructor. It would require you to instantiate the middleware for yourself, so there is now way other than the owner of the client can change that size limit.

This allows you to override the default limit. All you need to do is write your own factory function for Guzzle, that always configures the middleware with your limit (which could be 0 for no limit, maybe -1 to no body at all?).

What do you think about that? Alternatively, this could be a RequestException::dumpBodyInMessage function as well which enables the feature.

@superbiche
Copy link

Would an env var be an option?
Something like GUZZLE_ anything, as I suppose no other script in the world would use this prefix, it's as easy to implement (check, not present = default) and would give great flexibility to everyone?

If one sets this var and has memory or any type of issue that's one's problem, these are a few lines of code and would help in a way that words can't suffice to explain

@landsman
Copy link

Is this on some kind of roadmap?

@gmponos
Copy link
Member

gmponos commented May 18, 2020

Checking @mtdowling comment here... I am trying to think what would have happened if we allowed to log all the exception message and at the same time an issue exists like this (check the related PR here for even more info).

Although the above linked issue seems as a bug.. and therefore it must be fixed.. the point is that allowing to log all the response message makes it unsafe.

Friendly speaking... this is not under any roadmap.. and I don't see it coming any time soon :)

@pliszka2
Copy link

Hit the same issue....

FOUND perfect and simple WORKAROUND!

We could use: 'http_errors' => false,

$guzzleClient->post( $url, [ 'headers' => $headers, 'http_errors' => false, 'json' => $json, ] );

It's looks like it works with all request GET, POST, PUT, DELETE!

Now you can simply use "$response->getStatusCode()" and "$response->getBody()->getContents()"

@Pavel-Tsymbal
Copy link

Solution for me:

try {
//some code
} catch(BadResponseException $e) {
dd(json_decode($e->getResponse()->getBody()->getContents(), true));die;
}

Снимок экрана 2020-09-23 в 21 59 03

@Nyholm
Copy link
Member

Nyholm commented Oct 10, 2020

In guzzle 7.2.0 you may configure how the exception messages are truncated.

@Nyholm Nyholm closed this as completed Oct 10, 2020
@allan-simon
Copy link

@Nyholm this ticket being the one referenced by google when looking for truncate issue, can you put here a link to the documentation explaining how to do this, thanks a lot :)

@Nyholm
Copy link
Member

Nyholm commented Dec 11, 2020

See #2795

I would be happy to review a PR adding this to the documentation.

@PhilippGrashoff
Copy link

reading #2795, I can't figure how to get it working. The standard behaviour in 7.2.0 is still to truncate. Any hint where to change something to get the full error message?
My current Guzzle usage looks always more or less like this, is it enough to add some parameter to the constructor?:

        $this->guzzle = new Client(
            [
                'base_uri' => $this->apiBaseUrl,
                'headers' => [
                    'Authorization' => 'Bearer ' . $this->accessToken,
                    'Accept' => '*/*'
                ]
            ]
        );
        
        $result = $this->guzzle->get('someUrl');

@dhampik
Copy link

dhampik commented Mar 16, 2021

It looks like this is how to avoid truncation in Guzzle 7.2:

$stack = new HandlerStack(Utils::chooseHandler());
$stack->push(Middleware::httpErrors(new BodySummarizer(PHP_INT_MAX)), 'http_errors');
$stack->push(Middleware::redirect(), 'allow_redirects');
$stack->push(Middleware::cookies(), 'cookies');
$stack->push(Middleware::prepareBody(), 'prepare_body');
$client = new Client(['handler' => $stack /* other options here */ ]);

The above code is based on HandlerStack::create. Note, that if the list of default middlewares changes then this code needs to be updated. I would say that we should re-open this issue as it is still not fully solved.

If anyone knows how to get rid of the truncation in a simpler way (not counting $exception->getResponse()->getBody()), I would also appreciate, if you share this information (there is nothing in the documentation, btw).

Main problem is that even though BodySummarizer has the option to set $truncateAt to null, the \GuzzleHttp\Psr7\Message::bodySummary still uses 120 as default. Not sure, if that's intentional or not. Maybe @GrahamCampbell can comment on this? It is not obvious at least. I would expect to not have any truncation when $trancateAt is null in the BodySummarizer, but that's not possible without changing the truncation code in \GuzzleHttp\Psr7\Message::bodySummary.

@pculka
Copy link

pculka commented Apr 27, 2021

2021 and I'm still thinking. WHY on earth there is not a simple parameter to disable message truncation.
I really like Guzzle. It solves quite a lot of problems. But at least for development purposes I just want to say: "Leave that decision for the developers"...

@Nyholm
Copy link
Member

Nyholm commented Apr 27, 2021

Hey @pculka,

Your message is neither inspiring nor productive. It is also clear to me that you have not read this thread before posting. The issue was fixed last year.

If you want to show appreciation for the thousands of hours spent on maintaining guzzle over the past decade, you may help by sending a PR to improve the documentation around this issue.

@pculka
Copy link

pculka commented Apr 27, 2021

Hi @Nyholm,

I've read through the discussion, though piping a desired length is not a developer solution per se. Disabling truncation is something that was not implemented. Meaning I do not consider that a "fix", but merely an "override".

Just try to put "disable truncation guzzle" into a search engine of your choice and check how many results you come up with. And that is an issue that needs to be adressed, which the OP of this thread did.

Also, documentation lacks sufficient information about this issue. Just try to go to https://docs.guzzlephp.org/ - which is the official documentation website AFAIK and try to search for "truncated". How many results did you find?

@Nyholm
Copy link
Member

Nyholm commented Apr 27, 2021

Disabling truncation is something that was not implemented.

It is. There is even a BodySummarizerInterface which allows you to truncate at any length, no truncate or remove every vowel if you so desire.

Also, documentation lacks sufficient information about this issue.

I know, that is why I suggested you to update the documentation.

@allan-simon

This comment has been minimized.

@yannoff
Copy link

yannoff commented Jul 1, 2021

Hi guys,

So is anyone satisfied by that "middle-ware" options (#2795) ? Because that's not what I would call a solution, definitely not.

To me it sounds way too complicated. This is a far more complex workaround than catching the RequestException to fetch the response body contents as mentioned in the first place.

Whereas a simple error_message_maxlenght configuration option would be simpler to use (and, for what I've seen, to implement in guzzle too) developpers willing to have full error messages must reinvent the wheel by rewriting the whole middle-ware stack, in other words creating a custom guzzle implementation, that's just a non-sense.

@Kehza
Copy link

Kehza commented Sep 7, 2021

So I've just come across this issue too and find it crazy that you can't just specify a property to override this.. My Solution? Edit the library and make a note that it needs to be updated on each install sigh

@guzzle guzzle locked as too heated and limited conversation to collaborators Sep 7, 2021
@sagikazarmark
Copy link
Member

I'm locking the conversation, because trashing the maintainers and letting the frustration out here is not going to help anybody.

To summarise the issue:

  • the default truncation is not going away
  • there IS a way to dump entire HTTP message bodies (even if people don't like that it's not the default)

How you can improve the situation:

  • Provide evidence that the solution doesn't work (ie. doesn't allow you to dump entire HTTP message bodies, has bugs, etc). (Having to write 10-20 extra LOC to setup Guzzle doesn't count)
  • Send PRs to the documentation

It's impossible to make everyone happy, so please be professional: accept the situation and move on.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests