A number of PDF files converted to text (mutool convert...) still contain a few innocuous control characters (typically 0x0c in place of a page break). These cause Broot to display the files as hex dumps.
Please consider the patch below, which makes Broot more lenient: it accepts all UTF-8 encoded files by simply replacing control characters with a space (or an empty string).
Binary files are unaffected by this patch, because they are rejected earlier for not being valid UTF-8.
diff --git a/src/syntactic/syntactic_view.rs b/src/syntactic/syntactic_view.rs
 self.total_lines_count += 1;
 let start = offset;
offset += line.len();
+ line = line.replace(|ch:char| ch.is_control() && !"\t\n\r".contains(ch)," ");
- for c in line.chars(){
- if !is_char_printable(c){
- debug!("unprintable char: {:?}", c);
- return Err(ProgramError::UnprintableFile);
- }
- }
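The behaviour proposed by the patch can be tried in isolation with a minimal sketch like the one below. The function names `sanitize_line` and `sanitize_bytes` are hypothetical and do not exist in Broot. The second helper only illustrates the assumption stated above, that non-UTF-8 input is rejected before any replacement happens.

```rust
/// Hypothetical helper (not part of Broot): replaces every control character
/// other than tab, line feed and carriage return with a single space,
/// using the same closure pattern as the patch.
fn sanitize_line(line: &str) -> String {
    line.replace(|ch: char| ch.is_control() && !"\t\n\r".contains(ch), " ")
}

/// Hypothetical helper: returns the sanitized text, or `None` when the bytes
/// are not valid UTF-8 (a stand-in for the earlier binary-file rejection).
fn sanitize_bytes(bytes: &[u8]) -> Option<String> {
    std::str::from_utf8(bytes).ok().map(sanitize_line)
}
```

With these definitions, `sanitize_line("page 1\x0cpage 2")` yields `"page 1 page 2"`, while tabs and line endings pass through unchanged. Note that `char::is_control` matches the whole Unicode `Cc` category, so the C1 range (U+0080 to U+009F) would be replaced as well.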