Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: file upload fixed when using Base64 type in FilePicker and multipart/form-data in REST API #39332

Merged
merged 3 commits into from
Feb 18, 2025

Conversation

NilanshBansal
Copy link
Contributor

@NilanshBansal NilanshBansal commented Feb 18, 2025

Description

This PR fixes the file upload issues when we use the Base64 type in the File Picker Widget and the REST API uses a multipart/form-data type.

From the table here, this issue points to the 3rd one in the list.

Fixes #39227
Fixes #38177

Automation

/ok-to-test tags="@tag.Sanity"

🔍 Cypress test results

Tip

🟢 🟢 🟢 All cypress tests have passed! 🎉 🎉 🎉
Workflow run: https://github.com/appsmithorg/appsmith/actions/runs/13392351852
Commit: c731c65
Cypress dashboard.
Tags: @tag.Sanity
Spec:


Tue, 18 Feb 2025 14:36:12 UTC

Communication

Should the DevRel and Marketing teams inform users about this change?

  • Yes
  • No

Summary by CodeRabbit

Summary by CodeRabbit

  • New Features

    • Enhanced multipart form data handling for file uploads, now supporting robust processing of base64-encoded content with options for custom filenames and MIME types.
    • Introduced specific error messages for invalid multipart data and base64 format issues.
  • Refactor

    • Streamlined the data processing workflow with clearer error messages and simplified logic for improved reliability during file upload operations.

Copy link
Contributor

coderabbitai bot commented Feb 18, 2025

Walkthrough

This PR enhances multipart form data handling in the DataUtils class by introducing several new helper methods and refining existing logic. The changes include the addition of overloaded methods to process base64-encoded data, extract MIME types, and add parts to the multipart body. The overall modifications streamline the processing of both base64 and regular multipart data, making the flow clearer and improving error messages for parsing failures.

Changes

File Path Summary of Changes
app/server/appsmith-interfaces/.../DataUtils.java - Added Methods:
processBase64Data (with and without custom filename/MIME type)
processRegularData
extractMimeType
addPartToBody
parseMultipartData
- Modified Logic: Updated populateFileTypeBodyBuilder to branch based on base64 delimiter and refined error handling
app/server/appsmith-interfaces/.../BasePluginErrorMessages.java - Added Constants:
ERROR_INVALID_MULTIPART_DATA
ERROR_INVALID_BASE64_FORMAT

Possibly related PRs

  • test: Cypress test for validating file uploads greater than 20MB #38496: The changes in the main PR focus on enhancing the DataUtils class for processing multipart form data and base64 content, while the retrieved PR introduces a Cypress test for validating file uploads using multipart form data, indicating a direct connection in functionality related to file handling.

Suggested labels

File upload issues, S3

Suggested reviewers

  • ApekshaBhosale
  • NandanAnantharamu
  • btsgh

Poem

In streams of code where data flows,
Base64 secrets now brightly glow,
With methods new and logic clear,
Our multipart dance draws near,
Uploads sing in perfect rhythm 🚀
Cheers to improvements we all hold dear!

✨ Finishing Touches
  • 📝 Generate Docstrings (Beta)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@github-actions github-actions bot added the Bug Something isn't working label Feb 18, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
app/server/appsmith-interfaces/src/main/java/com/appsmith/external/helpers/restApiUtils/helpers/DataUtils.java (1)

352-362: Switch to UTF-8 for broader compatibility.
Using ISO-8859-1 could limit support for extended characters. Adopting UTF-8 helps accommodate a wider range of text inputs.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 88d3599 and cc0453b.

📒 Files selected for processing (1)
  • app/server/appsmith-interfaces/src/main/java/com/appsmith/external/helpers/restApiUtils/helpers/DataUtils.java (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: server-unit-tests / server-unit-tests
  • GitHub Check: server-spotless / spotless-check
🔇 Additional comments (8)
app/server/appsmith-interfaces/src/main/java/com/appsmith/external/helpers/restApiUtils/helpers/DataUtils.java (8)

44-44: The import is used as intended.
This import statement for Collections is valid and utilized in parseMultipartData.


303-321: Robust fallback handling for base64 vs. JSON.
This logic correctly branches between Base64 processing and JSON-based multipart parsing, ensuring comprehensive coverage of file inputs.


323-327: Convenient default overload.
Providing a simplified overload with default filename and MIME type is a neat approach to reduce code duplication.


329-350: Consider catching invalid Base64 data gracefully.
Although the current approach fails fast, gracefully handling potential decoding errors (e.g., malformed strings) might offer clearer error messages and debugging hints.


364-375: Straightforward metadata parsing.
Extracting the MIME type from the substring is concise and readable.


377-388: Good streaming approach for large files.
Reading bytes into a Flux<DataBuffer> is a sensible pattern for handling potentially large payloads.


390-401: Clear JSON parsing logic.
This approach neatly differentiates single objects from arrays while providing an informative exception if content is invalid.


403-415: Sufficient initial MIME-to-extension map.
Fallback to an empty extension is a useful default; consider expanding mappings for broader file types as needed.

@github-actions github-actions bot added Integrations Pod General Issues related to the Integrations Pod that don't fit into other tags. Integrations Pod Integrations Product Issues related to a specific integration Needs Triaging Needs attention from maintainers to triage Query & JS Pod Issues related to the query & JS Pod Query & Widgets Pod All issues related to Query, JS, Eval, and Widgets labels Feb 18, 2025
@NilanshBansal
Copy link
Contributor Author

/build-deploy-preview skip-tests=true

Copy link

Deploying Your Preview: https://github.com/appsmithorg/appsmith/actions/runs/13391123950.
Workflow: On demand build Docker image and deploy preview.
skip-tests: true.
env: ``.
PR: 39332.
recreate: .

Copy link

Deploy-Preview-URL: https://ce-39332.dp.appsmith.com

@NilanshBansal
Copy link
Contributor Author

@sneha122 @appsmithorg/query-widgets-pod I have tested the file upload with the following types of files and it works fine.
pdf [both large (>5MB) and small (<5MB) pdfs], mp3, jpeg, mp4, pptx, json, txt

@NilanshBansal NilanshBansal self-assigned this Feb 18, 2025
@NilanshBansal NilanshBansal added the ok-to-test Required label for CI label Feb 18, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (7)
app/server/appsmith-interfaces/src/main/java/com/appsmith/external/helpers/restApiUtils/helpers/DataUtils.java (7)

325-351: LGTM! Well-structured base64 processing with proper error handling.

The base64 processing methods are well-organized with clear separation of concerns. The overloaded method provides good flexibility for custom filenames and MIME types.

Consider adding a brief comment explaining the format of fileValue parameter in the base method, similar to how you documented the metadataPart format.


384-395: Consider making the buffer size configurable.

The buffer size of 4096 bytes is hardcoded in the addPartToBody method. Consider making this configurable to allow tuning for different file sizes and memory constraints.

+    private static final int DEFAULT_BUFFER_SIZE = 4096;
+
     private void addPartToBody(
             MultipartBodyBuilder bodyBuilder,
             String key,
             byte[] bytes,
             ClientHttpRequest outputMessage,
             String filename,
             String mimeType) {
         Flux<DataBuffer> data = DataBufferUtils.readInputStream(
-                () -> new ByteArrayInputStream(bytes), outputMessage.bufferFactory(), 4096);
+                () -> new ByteArrayInputStream(bytes), outputMessage.bufferFactory(), DEFAULT_BUFFER_SIZE);
         bodyBuilder.asyncPart(key, data, DataBuffer.class).filename(filename).contentType(MediaType.valueOf(mimeType));
     }

372-382: Add input validation for MIME type extraction.

While the MIME type extraction works well, consider adding validation to ensure the extracted MIME type follows the correct format (type/subtype).

     private String extractMimeType(String metadataPart, String defaultMimeType) {
+        if (metadataPart == null || metadataPart.isEmpty()) {
+            return defaultMimeType;
+        }
+
         if (metadataPart.startsWith("data:")) {
             int endIndex = metadataPart.indexOf(';');
             if (endIndex > 5) {
-                return metadataPart.substring(5, endIndex);
+                String mimeType = metadataPart.substring(5, endIndex);
+                return mimeType.matches("^\\w+/[-+.\\w]+$") ? mimeType : defaultMimeType;
             } else {
-                return metadataPart.substring(5);
+                String mimeType = metadataPart.substring(5);
+                return mimeType.matches("^\\w+/[-+.\\w]+$") ? mimeType : defaultMimeType;
             }
         }
         return defaultMimeType;
     }

326-351: Add validation for base64 content.

Consider adding validation for the base64 content to handle malformed input gracefully.

 private void processBase64Data(
         String fileValue,
         String key,
         MultipartBodyBuilder bodyBuilder,
         ClientHttpRequest outputMessage,
         String filename,
         String defaultMimeType) {
     String[] parts = fileValue.split(BASE64_DELIMITER, 2);
     if (parts.length != 2) {
         throw new AppsmithPluginException(
                 AppsmithPluginError.PLUGIN_DATASOURCE_ARGUMENT_ERROR, ERROR_INVALID_BASE64_FORMAT);
     }

     String metadataPart = parts[0];
     String base64Content = parts[1];
+    // Validate base64 content
+    if (!base64Content.matches("^[A-Za-z0-9+/]*={0,2}$")) {
+        throw new AppsmithPluginException(
+                AppsmithPluginError.PLUGIN_DATASOURCE_ARGUMENT_ERROR,
+                "Invalid base64 content format");
+    }
     String mimeType = extractMimeType(metadataPart, defaultMimeType);
     byte[] decodedBytes = Base64.getMimeDecoder().decode(base64Content);

     addPartToBody(bodyBuilder, key, decodedBytes, outputMessage, filename, mimeType);
 }

354-363: Consider using configurable character encoding.

The method currently uses ISO_8859_1 encoding. Consider making the encoding configurable to support different character sets.

 private void processRegularData(
         String data,
         String key,
         MultipartBodyBuilder bodyBuilder,
         ClientHttpRequest outputMessage,
         String filename,
-        String mimeType) {
-        byte[] bytes = data.getBytes(StandardCharsets.ISO_8859_1);
+        String mimeType,
+        Charset charset) {
+        byte[] bytes = data.getBytes(charset != null ? charset : StandardCharsets.ISO_8859_1);
         addPartToBody(bodyBuilder, key, bytes, outputMessage, filename, mimeType);
 }

303-323: Add method documentation for clarity.

The method handles both base64 and regular multipart data processing. Consider adding Javadoc to explain:

  • The expected format of input data
  • The difference between base64 and regular data handling
  • The purpose of each processing path

325-351: Add unit tests for base64 processing.

The base64 processing methods handle various scenarios but lack test coverage. Consider adding tests for:

  • Valid base64 data with different MIME types
  • Invalid base64 format
  • Edge cases with missing or malformed metadata

Would you like me to help generate unit test cases for these scenarios?

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cc0453b and c731c65.

📒 Files selected for processing (2)
  • app/server/appsmith-interfaces/src/main/java/com/appsmith/external/exceptions/pluginExceptions/BasePluginErrorMessages.java (1 hunks)
  • app/server/appsmith-interfaces/src/main/java/com/appsmith/external/helpers/restApiUtils/helpers/DataUtils.java (3 hunks)
👮 Files not reviewed due to content moderation or server errors (2)
  • app/server/appsmith-interfaces/src/main/java/com/appsmith/external/exceptions/pluginExceptions/BasePluginErrorMessages.java
  • app/server/appsmith-interfaces/src/main/java/com/appsmith/external/helpers/restApiUtils/helpers/DataUtils.java
⏰ Context from checks skipped due to timeout of 90000ms (11)
  • GitHub Check: perform-test / ci-test / ci-test (19)
  • GitHub Check: perform-test / ci-test / ci-test (15)
  • GitHub Check: perform-test / ci-test / ci-test (13)
  • GitHub Check: perform-test / ci-test / ci-test (12)
  • GitHub Check: perform-test / ci-test / ci-test (10)
  • GitHub Check: perform-test / ci-test / ci-test (7)
  • GitHub Check: perform-test / ci-test / ci-test (6)
  • GitHub Check: perform-test / ci-test / ci-test (5)
  • GitHub Check: perform-test / ci-test / ci-test (3)
  • GitHub Check: perform-test / ci-test / ci-test (0)
  • GitHub Check: server-unit-tests / server-unit-tests
🔇 Additional comments (9)
app/server/appsmith-interfaces/src/main/java/com/appsmith/external/exceptions/pluginExceptions/BasePluginErrorMessages.java (3)

24-27: LGTM! Clear and descriptive error messages.

The new error message constants follow the existing pattern and provide clear feedback for validation failures.


24-27: LGTM! Clear and descriptive error messages.

The new error message constants provide clear guidance about the expected formats for multipart data and base64 content.


24-27: LGTM! Clear and descriptive error messages.

The new error message constants follow the existing pattern and provide clear guidance for error scenarios.

app/server/appsmith-interfaces/src/main/java/com/appsmith/external/helpers/restApiUtils/helpers/DataUtils.java (6)

51-52: LGTM! Proper error message imports.

The error messages are imported statically, following Java best practices.


51-52: LGTM! Consistent error handling with descriptive messages.

The error handling is well-implemented using the new constants, providing clear feedback for invalid formats.

Also applies to: 238-239, 341-343, 404-406


365-395: LGTM! Well-documented and reusable helper methods.

The helper methods are well-structured with clear documentation and examples. The buffer size of 4096 bytes in addPartToBody is a good choice for streaming data.


44-44: LGTM! Necessary imports added.

The new imports support the enhanced multipart form data handling.

Also applies to: 51-52


365-382: LGTM! Well-documented MIME type extraction.

The method is well-documented with examples and handles edge cases gracefully by falling back to the default MIME type.


397-407: LGTM! Improved error handling and return type.

The method now:

  • Returns a strongly-typed List
  • Provides specific error messages for invalid data
  • Handles both single object and array cases efficiently

@NilanshBansal NilanshBansal merged commit f05e3be into release Feb 18, 2025
45 checks passed
@NilanshBansal NilanshBansal deleted the fix/issue-39227/file-upload-base64-multipart branch February 18, 2025 14:43
@sneha122
Copy link
Contributor

@NilanshBansal Let's create a separate task and add test cases for this functionality as discussed

@github-actions github-actions bot added QA Needs QA attention Core Query Execution Issues related to the execution of all queries labels Feb 21, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Something isn't working Core Query Execution Issues related to the execution of all queries Integrations Pod General Issues related to the Integrations Pod that don't fit into other tags. Integrations Pod Integrations Product Issues related to a specific integration Needs Triaging Needs attention from maintainers to triage ok-to-test Required label for CI QA Needs QA attention Query & JS Pod Issues related to the query & JS Pod Query & Widgets Pod All issues related to Query, JS, Eval, and Widgets
Projects
None yet
2 participants