-
-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Print non-printable characters using caret notation #2443
Print non-printable characters using caret notation #2443
Conversation
When this flag is set, non-printable characters are printed using caret notation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, looks sensible on first glance! Before a detailed review, please also add regression tests to integration_tests.rs
src/preprocessor.rs
Outdated
'\x1B' => output.push_str("^["), | ||
|
||
// everything else | ||
c => output.push_str(&c.escape_unicode().collect::<String>()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To reduce the diff and levels of indentation I would prefer if you added if use_caret
to each existing character arm instead is of doing the "refactoring" in the current PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry, I don't quite understand your comment. So should I refactor it or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for being vague. I mean that I would prefer e.g.
'\x1B' => output.push('␛'),
to be changed to
'\x1B' => output.push_str(if use_caret { "^[" } else { "␛" }),
(+ formatted however cargo fmt
prefers it)
Thank you for the update. I will review as soon as I get some time over. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good! To be cautious, I would like us to not have the --caret
alias, at least not in the initial version. Otherwise this is Approved from me.
src/bin/bat/clap_app.rs
Outdated
.arg( | ||
Arg::new("caret-notation") | ||
.long("caret-notation") | ||
.alias("caret") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer if we didn't make --caret
a short form of --caret-notation
. "caret notation" indeed seems to be the official term for this, but I don't think --caret
is clear enough on its own. I wouldn't rule out us adding an alias in the future, but since it is much easier to add stuff than to remove stuff, I would prefer if we started out without it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
This looks good - thank you! Just one question regarding the CLI. To avoid introducing (1) a new option and (2) a new option that can only be used together with another option, we could also think about modifying the Downside: this might break Another downside is that it looks a bit strange: What do others think of this option? |
I think we should leave But I like the idea of making the name of the new option more future-proof. Both OpenBSD |
Sorry for changing my mind, but I think it would be good to make the new option future proof!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this will affect the public API of the bat
library, it is worth putting some extra effort into designing the public API.
My preferred way to see how the public API is impacted is to run cargo public-api diff latest
after installing with cargo install --locked cargo-public-api
.
Here is the output I get when diffing the public API of your branch against the public API of the latest bat
release on crates.io (in the terminal there is syntax highlighting, but that gets lost on GitHub):
$ cargo public-api diff latest
Resolved `diff latest` to `diff 0.22.1`
Documenting bat v0.22.1
Finished dev [unoptimized + debuginfo] target(s) in 0.56s
Documenting bat v0.22.1 (/home/martin/src/bat)
Finished dev [unoptimized + debuginfo] target(s) in 0.63s
Removed items from the public API
=================================
(none)
Changed items in the public API
===============================
(none)
Added items to the public API
=============================
+pub bat::config::Config::nonprinting_notation: alloc::string::String
+pub fn bat::style::StyleComponents::clear(&mut self)
+pub fn bat::style::StyleComponents::insert(&mut self, component: bat::style::StyleComponent)
As we can see, bat::config::Config::nonprinting_notation: alloc::string::String
has been added to the public API.
I really think we should use an enum instead of a free form String
. See e.g. https://docs.rs/bat/latest/bat/enum.WrappingMode.html.
So can you introduce an enum please? Thanks!
Thank you for the reminder. I am unusually busy both privately and professionally, so I do not have time to look againt at this PR right now. I have not forgotten about it though. I would be surprised if it takes longer than a month before I get some time over. (Or maybe some other maintainer will take a look before that.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realized we already use the term nonprintable
in Config::show_nonprintable
. So to be consistent with ourselves, I think we should rename NonprintingNotation
to NonprintableNotation
.
Sorry for changing my mind again :(
src/nonprinting_notation.rs
Outdated
#[derive(Debug, Clone, Copy, PartialEq, Eq)] | ||
pub enum NonprintingNotation { | ||
Caret, | ||
Unicode, | ||
} | ||
|
||
impl Default for NonprintingNotation { | ||
fn default() -> Self { | ||
NonprintingNotation::Unicode | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add docs and automatically derive Default please? Proposal:
/// How to print non-printable characters with
/// [crate::config::Config::show_nonprintable]
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub enum NonprintingNotation {
/// Use unicode notation (␇, ␊, ␀, ..)
#[default]
Unicode,
/// Use caret notation (^G, ^J, ^@, ..)
Caret,
}
As the failed check shows, this is not compatible with the minimum supported Rust version (1.60), as this feature wasn't stabilized until version 1.62 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for the updates, this looks good to me now!
For the record, here is the bat
public API diff:
$ cargo public-api --omit blanket-impls --omit auto-trait-impls --omit auto-derived-impls diff origin/master..einfachIrgendwer0815/feature_caret_notation
Removed items from the public API
=================================
(none)
Changed items in the public API
===============================
(none)
Added items to the public API
=============================
+pub bat::config::Config::nonprintable_notation: bat::NonprintableNotation
+pub enum bat::NonprintableNotation
+pub bat::NonprintableNotation::Caret
+pub bat::NonprintableNotation::Unicode
/// How to print non-printable characters with | ||
/// [crate::config::Config::show_nonprintable] | ||
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)] | ||
pub enum NonprintableNotation { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot one thing! We need to make the new enum #[non_exhaustive]
. Otherwise adding more notations will require a major semver version bump.
pub enum NonprintableNotation { | |
#[non_exhaustive] | |
pub enum NonprintableNotation { |
You don't have to add it though. I can add it before I merge this. I plan on merging it early next week, unless something else comes up.
Implements feature request #2429.
When using
--show-all
, non-printable characters are printed using the Unicode Control Pictures. This pr adds the flag--caret-notation
, which can only be used together with--show-all
. When--caret-notation
is set, non-printable characters are printed using caret notation instead.For example, without
--caret-notation
:With
--caret-notation
:Instead of
--caret-notation
, the alias--caret
can be used too.