Making an ASP.NET Core endpoint accept both JSON parameters and a streamed file

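Background

I have recently been building a relay cloud drive: the drive itself only maintains the directory structure, while the file contents are stored by a third-party block storage service.

With this design, the file upload endpoint has to accept both parameters and content:

Request parameters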

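using System.ComponentModel.DataAnnotations;
using Microsoft.AspNetCore.Http;

public record UploadFileModel
{
    // The `field` keyword (field-backed properties) requires a recent C# language version.
    [Display(Name = "Path")]
    public string BasePath { get => field ?? string.Empty; set; }
    [Display(Name = "Description")]
    public string? Description { get => field ?? string.Empty; set; } = "";
    [Display(Name = "Tags (comma-separated)")]
    public string Tags { get => field ?? string.Empty; set; } = "";
    // Populated by model binding; the whole file is buffered before the action runs.
    [Display(Name = "Choose a file")]
    public IFormFile File { get; set; } = null!;
}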

Controller endpoint

[HttpPost("UploadFile", Name = "UploadFile")]
public async Task<IActionResult> UploadFileAsync(UploadFileModel fileUploadModel, CancellationToken cancellationToken)
{
    if (ModelState.IsValid)
    {
        // Split already returns an array, so no extra ToArray() is needed.
        var tags = (fileUploadModel.Tags ?? "").Split(",", StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries);
        // OpenReadStream reads from the copy that IFormFile has already buffered.
        await using var fs = fileUploadModel.File.OpenReadStream();
        await fileService.CreateFileAsync(UserId, fileUploadModel.BasePath, fileUploadModel.File.FileName, fs,
            new ItemAttribute(fileUploadModel.Description, tags), cancellationToken);
        messageAccessor.Add(OperationMessageLevel.Success, "Success", "File uploaded successfully");
        return RedirectToRoute("MyDrive");
    }
    return BadRequest();
}

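Functionally, the code above is fine.

Implementation-wise there is room for improvement: depending on the file size, IFormFile buffers the content in memory or on disk. The ASP.NET Core documentation puts it this way: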

The entire file is read into an IFormFile. IFormFile is a C# representation of the file used to process or save the file.

The disk and memory used by file uploads depend on the number and size of concurrent file uploads. If an app attempts to buffer too many uploads, the site crashes when it runs out of memory or disk space. If the size or frequency of file uploads is exhausting app resources, use streaming.

Any single buffered file exceeding 64 KB is moved from memory to a temp file on disk.

Temporary files for larger requests are written to the location named in the ASPNETCORE_TEMP environment variable. If ASPNETCORE_TEMP is not defined, the files are written to the current user’s temporary folder.

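A relay drive does not store content itself, so buffering adds needless overhead and also exposes an extra denial-of-service attack surface. A better approach is streaming: forward the file content unchanged to the block storage service and let that service worry about the details.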
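As a rough illustration of what "forwarding the content unchanged" can look like, the sketch below wraps an incoming stream in StreamContent and sends it upstream without collecting it first. The BlockStoreForwarder class, the PUT verb and the target URI are assumptions made for this example; the real block storage API is not covered in this post.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for illustration only; not part of the original project.
public static class BlockStoreForwarder
{
    // Sends `source` as the request body of a PUT to `target`.
    // StreamContent copies from the source in chunks while the request is being
    // written, so this code never materializes the whole file as a byte array.
    public static async Task ForwardAsync(
        HttpClient client, Uri target, Stream source, CancellationToken cancellationToken)
    {
        // Disposing the StreamContent also disposes the source stream.
        using var content = new StreamContent(source);
        content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");

        using var response = await client.PutAsync(target, content, cancellationToken);
        response.EnsureSuccessStatusCode();
    }
}
```

Given a stream for the uploaded file, a call would look like `await BlockStoreForwarder.ForwardAsync(client, blockStoreUri, stream, token)`, where `blockStoreUri` is a placeholder for whatever address the storage service exposes.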

From buffering to streaming

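To stream the upload, the request is still sent as multipart/form-data, but the payload has to be read by hand instead of through model binding. There are two reasons:

  1. ASP.NET Core model binding handles multipart/form-data only by buffering the whole form (that is what IFormFile does); it cannot hand a part to the action as an unbuffered stream.
  2. The HTTP request stream is not seekable. Once model binding has read the request body, it is too late for us to step in.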

Below is a client built with HttpClient that sends JSON parameters and file content in the same request:

using System.Net.Http.Headers;
using System.Net.Http.Json;

namespace Client;
internal class Program
{
    static async Task Main(string[] args)
    {
        var client = new HttpClient();
        var endpoint = "url_to_your_end_point";
        while (true)
        {
            try
            {
                // Disposing the multipart content also disposes its parts and the file stream.
                using var content = new MultipartFormDataContent();
                // The JSON part goes first: the server needs the parameters before it reaches the file.
                content.Add(JsonContent.Create(new { Name = "Auser", Age = 14 }), "json");
                var binaryContent = new StreamContent(File.OpenRead(@"path_to_your_large_file"));
                binaryContent.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
                content.Add(binaryContent, "binary");
                using var response = await client.PostAsync(endpoint, content);
                response.EnsureSuccessStatusCode();
            }
            catch (Exception ex)
            {
                Console.WriteLine($"Error:{ex}");
            }
            finally
            {
                Console.WriteLine("Press enter to continue");
                Console.ReadLine();
            }
        }
    }
}
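To see what the server will actually receive, it can help to serialize a small version of the same two-part body and print it. The sketch below does that with a fixed boundary and a short text payload in place of the large file; the MultipartWireDemo class, the boundary string and the payload are made up for this illustration.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Net.Http.Json;
using System.Text;
using System.Threading.Tasks;

// Hypothetical demo class; only the HttpClient content types are real APIs.
public static class MultipartWireDemo
{
    // Builds a JSON part followed by a binary part, as the client above does,
    // and returns the serialized multipart body as text.
    public static async Task<string> BuildAsync()
    {
        using var content = new MultipartFormDataContent("demo-boundary");
        content.Add(JsonContent.Create(new { Name = "Auser", Age = 14 }), "json");

        // A tiny ASCII payload stands in for the file so the output stays readable.
        var binary = new ByteArrayContent(Encoding.ASCII.GetBytes("file bytes go here"));
        binary.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
        content.Add(binary, "binary");

        return await content.ReadAsStringAsync();
    }
}
```

Printing the result with `Console.WriteLine(await MultipartWireDemo.BuildAsync());` shows that each part starts with a `--demo-boundary` line and carries its own Content-Type header, which is what the server-side reader keys on. It also shows that JsonContent.Create serializes with the web defaults, so the property names go out in camelCase.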

On the server, MultipartReader lets us process the multipart/form-data content by hand:

using System.Text.Json;
using Microsoft.AspNetCore.Mvc;
using Microsoft.AspNetCore.WebUtilities;
using Microsoft.Extensions.Primitives;
using Microsoft.Net.Http.Headers;

namespace Server.Helper;
public class MultipartHelper<T>(HttpContext context) where T : class
{
    // JsonContent on the client serializes with the web defaults (camelCase names),
    // so use the same defaults here to match property names case-insensitively.
    private static readonly JsonSerializerOptions JsonOptions = new(JsonSerializerDefaults.Web);

    public async Task<IActionResult> ExecuteAsync(Func<T, Stream, CancellationToken, Task<IActionResult>> callback)
    {
        var cancellationToken = context.RequestAborted;
        // Make sure the request really is multipart/form-data.
        if (!MediaTypeHeaderValue.TryParse(context.Request.ContentType, out MediaTypeHeaderValue? contentType)
            || !contentType.MediaType.Equals("multipart/form-data", StringComparison.OrdinalIgnoreCase))
        {
            throw new InvalidDataException("Incorrect mime-type");
        }

        T? jsonData = null;
        var multipartBoundary = GetBoundary(contentType, 70);
        var multipartReader = new MultipartReader(multipartBoundary, context.Request.Body);
        // Sections are read in the order the client wrote them.
        while (await multipartReader.ReadNextSectionAsync(cancellationToken) is { } section)
        {
            if (!MediaTypeHeaderValue.TryParse(section.ContentType, out MediaTypeHeaderValue? sectionType))
            {
                throw new InvalidDataException("Invalid content type in section " + section.ContentType);
            }

            if (sectionType.MediaType.Equals("application/json", StringComparison.OrdinalIgnoreCase))
            {
                jsonData = await JsonSerializer.DeserializeAsync<T>(section.Body, JsonOptions, cancellationToken);
            }
            else if (sectionType.MediaType.Equals("application/octet-stream", StringComparison.OrdinalIgnoreCase))
            {
                if (jsonData == null)
                {
                    throw new InvalidDataException("The JSON section must come before the binary section.");
                }
                // Hand the unbuffered section stream to the caller and stop reading.
                return await callback(jsonData, section.Body, cancellationToken);
            }
            else
            {
                throw new InvalidDataException("Unsupported content type in section " + section.ContentType);
            }
        }

        // The body ended without an application/octet-stream section.
        throw new InvalidDataException("No binary section found in the request.");
    }

    // Extracts the boundary from the Content-Type header and enforces a length limit.
    static string GetBoundary(MediaTypeHeaderValue contentType, int lengthLimit)
    {
        var boundary = HeaderUtilities.RemoveQuotes(contentType.Boundary);
        if (StringSegment.IsNullOrEmpty(boundary))
        {
            throw new InvalidDataException("Missing content-type boundary.");
        }
        if (boundary.Length > lengthLimit)
        {
            throw new InvalidDataException($"Multipart boundary length limit {lengthLimit} exceeded.");
        }
        return boundary.ToString();
    }
}

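This is only a thin wrapper: the controller passes its logic as the Func<T, Stream, CancellationToken, Task<IActionResult>> callback parameter. The JSON part has to arrive before the binary part, because the callback runs as soon as the binary section is reached.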

using System.Buffers;
using Microsoft.AspNetCore.Mvc;
using System.Security.Cryptography;
using Server.Helper;

public record Student(string Name, int Age);
public class FileController(ILogger<FileController> logger) : Controller
{
    // Keep the action parameterless so that model binding never touches the request body.
    [HttpPost]
    [RequestSizeLimit(100_000_000_000)] // 100 GB
    public async Task<IActionResult> UploadAsync()
    {
        var helper = new MultipartHelper<Student>(HttpContext);
        return await helper.ExecuteAsync(async (student, stream, token) =>
        {
            // Do something with the stream and the parameter; here we just hash the content.
            logger.LogInformation("Json param is {student}", student);
            using var owner = MemoryPool<byte>.Shared.Rent(1024);
            var buffer = owner.Memory;
            using IncrementalHash sha256 = IncrementalHash.CreateHash(HashAlgorithmName.SHA256);
            while (true)
            {
                int read = await stream.ReadAsync(buffer, token);
                if (read == 0)
                {
                    break; // end of the file section
                }
                sha256.AppendData(buffer.Span[..read]);
            }

            var hash = sha256.GetHashAndReset();
            logger.LogInformation("SHA256 is {sha256}", Convert.ToHexString(hash));
            return Ok();
        });
    }
}

Pitfalls

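  1. A CancellationToken parameter on the controller action triggers model binding, which reads the request body before the action runs. The right approach is to use HttpContext.RequestAborted instead.
// Anti-pattern, shown for illustration only.
public async Task<IActionResult> UploadAsync(CancellationToken cancellationToken)
{
    // The CancellationToken parameter triggers model binding, so by the time
    // MultipartReader steps in, the request stream has already been read.
    // Keep the action parameterless and use HttpContext.RequestAborted instead.
    return Ok();
}
  2. Kestrel is a web server designed around asynchronous I/O, and synchronous reads of the request body are disabled by default (AllowSynchronousIO). Prefer the async methods such as ReadAsync to avoid errors and unnecessary performance overhead.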
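For the first pitfall, keeping the action parameterless is the fix used in this post. If an action really does need bound parameters, one alternative is to stop MVC from reading the form at all. The resource filter below follows the pattern from the official ASP.NET Core file upload sample; I have not used it in this project, so treat it as a sketch rather than a verified part of the solution.

```csharp
using System;
using Microsoft.AspNetCore.Mvc.Filters;
using Microsoft.AspNetCore.Mvc.ModelBinding;

// Removes the value provider factories that read the form body, so the
// request stream is still untouched when the action starts reading it.
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method)]
public sealed class DisableFormValueModelBindingAttribute : Attribute, IResourceFilter
{
    public void OnResourceExecuting(ResourceExecutingContext context)
    {
        var factories = context.ValueProviderFactories;
        factories.RemoveType<FormValueProviderFactory>();
        factories.RemoveType<FormFileValueProviderFactory>();
        factories.RemoveType<JQueryFormValueProviderFactory>();
    }

    public void OnResourceExecuted(ResourceExecutedContext context)
    {
        // Nothing to clean up after the action has run.
    }
}
```

Applied as `[DisableFormValueModelBinding]` on the upload action, this leaves route and query string binding in place while form fields and files are no longer read automatically.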