Computer Graphics

Published: 2023-02-12

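When learning a technology, we usually start with the simplest possible example and gradually work out its basic building blocks and how it runs. Vulkan is no exception, but because Vulkan requires every setting to be specified explicitly, even a very simple hello-world program turns out to be fairly complex.

Vulkan tutorials uses drawing a triangle to introduce the various Vulkan objects and the overall program flow, but this approach makes it easy to get lost in API details and miss the forest for the trees. This article is therefore a retrospective: after finishing the triangle part of Vulkan tutorials, I went back through the code and sorted out the overall process.

From this article you will learn:

The basic objects a Vulkan program needs;
The general flow of a Vulkan program;

You will not learn:

How to use the Vulkan API in detail;
The exact meaning of every field of the APIs mentioned here;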
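Initialization

A Vulkan program has to do several things during initialization:

Check for and enable validation layers
Check for and enable extensions
Create the instance
Enumerate the available physical devices
Create logical devices that represent the physical devices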
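Check for and enable validation layers (optional)

This step is not mandatory, but validation layers are normally enabled during debugging in order to detect and handle errors. Layers act as hooks around Vulkan function calls and can serve many purposes, such as validation, profiling, and replay.

Pseudocode for querying the available validation layers:

uint32_t layerCount = 0;
// First call: query how many layers are available
vkEnumerateInstanceLayerProperties(&layerCount, nullptr);
// Second call: fetch the properties (including the name) of each layer
std::vector<VkLayerProperties> availableLayers(layerCount);
vkEnumerateInstanceLayerProperties(&layerCount, availableLayers.data());

In most cases the layer to look for is VK_LAYER_KHRONOS_validation.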
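The lookup itself can be sketched without any Vulkan types. The following is a minimal sketch, assuming the layer names have already been copied out of the enumerated VkLayerProperties into strings; the helper name allLayersAvailable is hypothetical and not part of the Vulkan API.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: returns true only if every requested layer name
// appears in the list of available layer names.
// `available` stands in for the layerName fields of the enumerated VkLayerProperties.
bool allLayersAvailable(const std::vector<std::string>& available,
                        const std::vector<std::string>& requested) {
    return std::all_of(requested.begin(), requested.end(),
        [&](const std::string& name) {
            return std::find(available.begin(), available.end(), name) != available.end();
        });
}
```

If the check fails, a program would typically either abort or continue without validation, depending on whether it is a debug build.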

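Check for and enable extensions (optional)

This step is not strictly required for a Vulkan program either. If the program renders and needs to show the result on screen, it has to talk to the window system and therefore enable a set of extensions such as VK_KHR_surface; on platforms without native Vulkan support, such as macOS, it also needs VK_EXT_metal_surface.

However, rendering results do not have to be presented, and a Vulkan program may only run compute work, so this step is optional as well.

Querying the available extensions works the same way as for validation layers:

uint32_t extensionCount = 0;
vkEnumerateInstanceExtensionProperties(nullptr, &extensionCount, nullptr);
std::vector<VkExtensionProperties> extensions(extensionCount);
vkEnumerateInstanceExtensionProperties(nullptr, &extensionCount, extensions.data());

Extensions can also be provided by a specific layer. Because pLayerName is set to nullptr here, only the extensions provided by the Vulkan implementation itself (and by implicitly enabled layers) are returned.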

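Create the Instance

The two steps above both prepare for creating the instance. A VkInstance stores per-application state, so every application needs its own instance.

Besides general information, the most important part of creating a VkInstance is enabling the validation layers and extensions checked above. The code looks like this:

VkInstanceCreateInfo createInfo{};
createInfo.sType = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO;
// extensionNames: std::vector<const char*> with the names of the extensions to enable
createInfo.enabledExtensionCount = static_cast<uint32_t>(extensionNames.size());
createInfo.ppEnabledExtensionNames = extensionNames.data();

// validationLayers: std::vector<const char*> with the names of the layers to enable
createInfo.enabledLayerCount = static_cast<uint32_t>(validationLayers.size());
createInfo.ppEnabledLayerNames = validationLayers.data();

vkCreateInstance(&createInfo, nullptr, &instance);

At this point the application-level configuration is done. Next, the components needed to put the GPU to work have to be configured.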

Enumerate Physical devices

In Vulkan, GPU hardware is represented by physical devices.

A physical device usually represents a single complete implementation of Vulkan (excluding instance-level functionality) available to the host, of which there are a finite number.

Because they stand for actual hardware, physical devices really exist and are finite in number, so the program has to query how many there are before using one. A physical device is represented by VkPhysicalDevice.

uint32_t deviceCount = 0;
vkEnumeratePhysicalDevices(instance, &deviceCount, nullptr);
std::vector<VkPhysicalDevice> devices(deviceCount);
vkEnumeratePhysicalDevices(instance, &deviceCount, devices.data());

Since physical devices correspond to physical components, we enumerate them rather than create them.

Create logical devices

Although a physical device represents the GPU hardware, in practice we work with its logical representation, the logical device, represented by VkDevice.

A logical device represents an instance of that implementation (physical devices) with its own state and resources independent of other logical devices.

The VkDevice is the central object used by almost every Vulkan API call, and other core objects used later, such as queues, are obtained from it.

Creating a VkDevice involves the following work:

Find a suitable queue
Check the available validation layers and extensions

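Find an available queue

A queue is the channel through which the application submits work to the GPU. As the intermediary between the application and the GPU, it determines how work is submitted and in what order it executes.

Like the different kinds of roads connecting stations, queues come in different types (queue families), for example those that accept graphics commands and those that accept compute commands. My understanding is that this separation improves efficiency: a bicycle lane and a motorway cannot be merged without making both worse.

Rendering needs a queue from a family that supports VK_QUEUE_GRAPHICS_BIT, so the first step is to find such a family.

uint32_t queueFamilyCount = 0;
vkGetPhysicalDeviceQueueFamilyProperties(physicalDevice, &queueFamilyCount, nullptr);

std::vector<VkQueueFamilyProperties> queueFamilies(queueFamilyCount);
vkGetPhysicalDeviceQueueFamilyProperties(physicalDevice, &queueFamilyCount, queueFamilies.data());

Then iterate over the queue families to find the index of a matching one. Once the logical device has been created with that family index, the queue itself can be retrieved from it.

// device is the logical VkDevice; index is the graphics queue family index
vkGetDeviceQueue(device, index, 0, &graphicsQueue);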
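The iteration over queue families can be sketched in isolation. This is a minimal sketch under stated assumptions: QueueFamilyInfo is a hypothetical stand-in for the two relevant fields of VkQueueFamilyProperties, the constants mirror the numeric values of VK_QUEUE_GRAPHICS_BIT, VK_QUEUE_COMPUTE_BIT and VK_QUEUE_TRANSFER_BIT, and findQueueFamily is an illustrative helper, not a Vulkan function.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for VkQueueFamilyProperties (only the fields used here).
struct QueueFamilyInfo {
    std::uint32_t queueFlags;  // bitmask of supported capabilities
    std::uint32_t queueCount;  // number of queues in this family
};

// Same numeric values as the corresponding VK_QUEUE_*_BIT flags.
constexpr std::uint32_t kGraphicsBit = 0x1;
constexpr std::uint32_t kComputeBit  = 0x2;
constexpr std::uint32_t kTransferBit = 0x4;

// Returns the index of the first family that has at least one queue and
// supports every bit in `required`, or -1 if there is none.
int findQueueFamily(const std::vector<QueueFamilyInfo>& families, std::uint32_t required) {
    for (std::uint32_t i = 0; i < families.size(); ++i) {
        if (families[i].queueCount > 0 && (families[i].queueFlags & required) == required) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

Taking the first match keeps the sketch short; a real application might instead prefer a family that also supports presentation, or a dedicated transfer family.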

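Check the available validation layers and extensions

Creating the instance involves checking and enabling validation layers and extensions, and creating the device involves a similar step.

Vulkan provides two kinds of extensions:

Instance-specific: This provides global-level extensions
Device-specific: This provides physical-device-specific extensions

Pseudocode:

const std::vector<const char*> deviceExtensions = {VK_KHR_SWAPCHAIN_EXTENSION_NAME};
VkDeviceCreateInfo createInfo{};
createInfo.sType = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO;
// queueCreateInfoCount / pQueueCreateInfos must also be filled in (omitted here)
createInfo.enabledExtensionCount = static_cast<uint32_t>(deviceExtensions.size());
createInfo.ppEnabledExtensionNames = deviceExtensions.data();
createInfo.ppEnabledLayerNames = validationLayers.data(); // ignored by up-to-date implementations
vkCreateDevice(physicalDevice, &createInfo, nullptr, &device);

One thing to note: according to this article, device-specific validation layers are ignored in up-to-date implementations and no longer need to be considered, but setting them is still recommended for compatibility with older versions.

Summary

The process is summarized in the figure below:

Figure: The initialization process and the Vulkan objects involved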

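Presentation (optional)

Displaying the rendering result is not mandatory, but support for it is usually added in order to see whether the Vulkan program works correctly. Because it is optional, presentation is implemented through extensions.

Extensions are enabled in two places during initialization:

When creating the instance;
When creating the logical device;

Presenting the rendering result therefore reuses these steps to enable the relevant WSI (window system integration) extensions.

Presentation requires the following:

Enable the relevant extensions
Create the window and the surface
Obtain a queue for presentation
Create the swapchain and its associated images
Retrieve the images from the swapchain and create an image view for each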
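Enable the extensions used for presentation

First, the VK_KHR_surface extension has to be enabled, which is done when creating the instance.

VkInstanceCreateInfo createInfo{};
// extensionNames: names of the instance extensions to enable, including VK_KHR_surface
createInfo.enabledExtensionCount = static_cast<uint32_t>(extensionNames.size());
createInfo.ppEnabledExtensionNames = extensionNames.data();

With glfw, this extension is required to create a window, so the list of required extensions can be obtained directly from the glfw API:

uint32_t glfwExtensionCount = 0;
const char** glfwExtensions;
// Instance extensions that glfw requires from Vulkan
glfwExtensions = glfwGetRequiredInstanceExtensions(&glfwExtensionCount);

In addition, VK_KHR_swapchain must be enabled. It is a device-level extension that is enabled when creating the logical device, as already shown in the section on checking the available validation layers and extensions.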

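Create the window and the surface

The Vulkan API is platform independent, decoupled from any particular platform, but showing the rendering result in a window requires platform-specific operations, which are tedious. The glfw library takes care of these details, so we can create the window with glfw directly.

// Call glfwWindowHint(GLFW_CLIENT_API, GLFW_NO_API) first so that glfw does not create an OpenGL context
GLFWwindow *window = glfwCreateWindow(800, 600, "Vulkan", nullptr, nullptr);

Once the window exists, and with the extensions above enabled, a VkSurfaceKHR object has to be created. It is an abstraction of the native platform's window surface, onto which the rendering result is presented.

VkSurfaceKHR surface;
glfwCreateWindowSurface(instance, window, nullptr, &surface);

This, too, can be done with a single glfw API call.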

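Obtain a queue for presentation

Although Vulkan supports window system integration, that does not mean every device supports WSI at the hardware level, so each queue family of the device has to be checked to find one that supports presentation to the surface.

uint32_t presentFamily = UINT32_MAX; // UINT32_MAX: no presentation-capable family found yet
for (uint32_t i = 0; i < queueFamilyCount; ++i) {
    VkBool32 presentSupport = VK_FALSE;
    vkGetPhysicalDeviceSurfaceSupportKHR(physicalDevice, i, surface, &presentSupport);
    if (presentSupport) { presentFamily = i; break; }
}

Once such a queue family is known to exist, the queue is retrieved from the logical device.

VkQueue presentQueue;
vkGetDeviceQueue(device, presentFamily, 0, &presentQueue);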

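Create the swapchain

Vulkan has no equivalent of OpenGL's default framebuffer. Its counterpart is the swapchain, which holds the rendering results waiting to be shown on screen; it is essentially a queue of images that buffers a series of rendered frames.

We have already enabled the swapchain extension, but that is not enough: the properties the swapchain can have must also be queried, namely the surface capabilities, the surface formats, and the present modes:

VkSurfaceCapabilitiesKHR capabilities;
vkGetPhysicalDeviceSurfaceCapabilitiesKHR(physicalDevice, surface, &capabilities);

std::vector<VkSurfaceFormatKHR> formats;
uint32_t formatCount = 0;
vkGetPhysicalDeviceSurfaceFormatsKHR(physicalDevice, surface, &formatCount, nullptr);
formats.resize(formatCount);
vkGetPhysicalDeviceSurfaceFormatsKHR(physicalDevice, surface, &formatCount, formats.data());

std::vector<VkPresentModeKHR> presentModes;
uint32_t presentModeCount = 0;
vkGetPhysicalDeviceSurfacePresentModesKHR(physicalDevice, surface, &presentModeCount, nullptr);
presentModes.resize(presentModeCount);
vkGetPhysicalDeviceSurfacePresentModesKHR(physicalDevice, surface, &presentModeCount, presentModes.data());

// createInfo is filled in from the surface and the chosen format, present mode, extent and image count
VkSwapchainCreateInfoKHR createInfo{};
vkCreateSwapchainKHR(device, &createInfo, nullptr, &swapChain);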
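One of the values derived from the queried capabilities is the number of swapchain images. The sketch below shows a common way to pick it (one more than the minimum, clamped to the maximum). SurfaceLimits is a hypothetical stand-in for the minImageCount and maxImageCount fields of VkSurfaceCapabilitiesKHR, and chooseImageCount is an illustrative helper rather than anything from the article's code.

```cpp
#include <cstdint>

// Hypothetical stand-in for two fields of VkSurfaceCapabilitiesKHR.
struct SurfaceLimits {
    std::uint32_t minImageCount;
    std::uint32_t maxImageCount;  // 0 means there is no upper limit
};

// Requests one image more than the minimum so the application is less likely
// to wait on the driver, then clamps to the maximum if one is reported.
std::uint32_t chooseImageCount(const SurfaceLimits& caps) {
    std::uint32_t count = caps.minImageCount + 1;
    if (caps.maxImageCount > 0 && count > caps.maxImageCount) {
        count = caps.maxImageCount;
    }
    return count;
}
```

The result would go into the minImageCount field of VkSwapchainCreateInfoKHR; the implementation is still free to create more images than requested, which is why the image count is queried again after creation.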

Retrieve the swapchain images and create their image views

After the swapchain has been created, its images have to be retrieved; they serve as render targets during rendering.

std::vector<VkImage> swapChainImages;
uint32_t imageCount = 0;
vkGetSwapchainImagesKHR(device, swapChain, &imageCount, nullptr);
swapChainImages.resize(imageCount);
vkGetSwapchainImagesKHR(device, swapChain, &imageCount, swapChainImages.data());

To use these images during rendering, image views have to be created. An image view can be thought of like a view on a relational table in a database (which can be used for access isolation): it determines which part of the image data is accessed and how.

VkImageView imageView;
VkImageViewCreateInfo createInfo{};
vkCreateImageView(device, &createInfo, nullptr, &imageView);

Summary

Figure: Preparation for presenting the render target

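Prepare the renderpass

So far we have finished initialization and know where the final rendering result will be displayed. Rendering itself, however, involves many steps, each of which produces intermediate results that later steps may consume. How are these intermediate results set up? First of all, they have to be stored in memory so that later steps can read them. Vulkan introduces the concept of a renderpass for exactly this purpose: it specifies where intermediate rendering results are stored and how that storage is used, and so describes how data flows through the whole rendering process.

Creating a renderpass depends entirely on VkRenderPassCreateInfo. It takes three kinds of information (attachments, subpasses, and subpass dependencies), which ultimately come down to two kinds of objects:

subpass: for optimization purposes, a renderpass can be divided into several subpasses;
attachments: essentially a set of images used during rendering. The term is interesting: they are not called images directly but are attached to the renderpass as attachments.

Figure: Information needed to create a renderpass

We therefore need to do the following:

Decide which attachments are needed
Build the subpasses
Create the framebuffer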

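Add attachments

Attachments are essentially images that store the intermediate results of rendering. Operations can be specified for both the start and the end of rendering, as shown in the code:

VkAttachmentDescription colorAttachment{};
// Format of the image view
colorAttachment.format = swapChainImageFormat;
// At the start of rendering, clear the image backing the attachment
colorAttachment.loadOp = VK_ATTACHMENT_LOAD_OP_CLEAR;
// At the end of rendering, store the attachment contents to memory so they can be read later
colorAttachment.storeOp = VK_ATTACHMENT_STORE_OP_STORE;

Separate attachments can be set up for color and depth information.

Note that at this point the attachments are not yet associated with actual image views; that association is made through the framebuffer.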

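Set up the subpasses of the renderpass

A subpass represents a phase of rendering that reads and writes a subset of the attachments in a render pass. Rendering commands are recorded into a particular subpass of a render pass instance.

For optimization, a renderpass is further divided into subpasses. A subpass also uses attachments, in the following way:

VkAttachmentReference colorAttachmentRef{};
// Index into the renderpass's attachment array, and the layout used during the subpass
colorAttachmentRef.attachment = 0;
colorAttachmentRef.layout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
VkSubpassDescription subpass{};
subpass.pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS;
subpass.colorAttachmentCount = 1;
subpass.pColorAttachments = &colorAttachmentRef;

In addition, subpasses can depend on one another. A subpass dependency expresses execution dependencies and memory dependencies between subpasses, only without using Vulkan's explicit synchronization primitives.

VkSubpassDependency dependency{};

With these two steps done, the renderpass object can be created:

VkRenderPassCreateInfo renderPassInfo{};
vkCreateRenderPass(device, &renderPassInfo, nullptr, &renderPass);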

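Create the framebuffer

Framebuffers represent a collection of specific memory attachments that a render pass instance uses.

The attachments in the renderpass only provide descriptions. To actually access them, they have to be tied to memory, and that is the job of the framebuffer.

A framebuffer is created against a specific renderpass, which means it can only be used with that renderpass (or one compatible with it). Because the framebuffer ties the renderpass's attachments to memory, the image views are referenced directly at creation time.

VkFramebuffer framebuffer;
VkImageView attachments[1];

VkFramebufferCreateInfo framebufferInfo{};
framebufferInfo.renderPass = renderPass;
framebufferInfo.attachmentCount = 1;
// Image views used as the renderpass attachments
framebufferInfo.pAttachments = attachments;
// width, height and layers must also be set (omitted here)
vkCreateFramebuffer(device, &framebufferInfo, nullptr, &framebuffer);

Summary

To create a renderpass we need attachments, which specify how intermediate rendering results are configured during the renderpass and are tied to memory through the framebuffer; the subpasses into which the renderpass is divided can then use these attachments to do their work.

Figure: Creating the renderpass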

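Create the descriptor set

This step is required for creating the pipeline. Put simply: how does a shader, one of the core components of the rendering pipeline, access memory resources? Through descriptors, which tell the shader how to find the resources it needs.

David-DiGioiahu drew a diagram that is very helpful for understanding:

How shaders and descriptors are related;
How a descriptor set is created;
How a descriptor set is accessed in the pipeline;
And how it is updated;

It is shown below:

Figure: Descriptor sets and their relationship to other objects

The diagram describes the creation of descriptor sets very clearly. Several different Vulkan objects are involved:

A descriptor is an opaque data structure representing a shader resource;
Descriptors are organized into descriptor sets, which are bound during command recording for use in subsequent drawing commands;
The arrangement of content in each descriptor set is determined by a descriptor set layout;

The process can be summarized as follows:

Create a descriptor pool: makes descriptor allocation more efficient;
Specify the descriptor layout: provides the type information of the descriptors;
Allocate descriptor sets from the descriptor pool according to the descriptor layout;

The process is visualized in the article, as shown in the figure below:
Figure: How a descriptor set is built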

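Set up the descriptor layout

Put simply, this step provides type information for the memory resources used by the shaders in the pipeline, for example a uniform buffer, through which the shader can access a buffer directly.

The code for creating a descriptor layout:

VkDescriptorSetLayoutBinding layoutBind[2]{};
layoutBind[0].descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
layoutBind[0].binding = 0;
layoutBind[0].descriptorCount = 1;
layoutBind[0].stageFlags = VK_SHADER_STAGE_VERTEX_BIT;

layoutBind[1].descriptorType = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER;
// Binding numbers must be unique within one set layout
layoutBind[1].binding = 1;
layoutBind[1].descriptorCount = 1;
layoutBind[1].stageFlags = VK_SHADER_STAGE_FRAGMENT_BIT;

VkDescriptorSetLayoutCreateInfo descriptorLayout = {};
descriptorLayout.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
descriptorLayout.bindingCount = 2;
descriptorLayout.pBindings = layoutBind;
// A single layout that contains both bindings
std::vector<VkDescriptorSetLayout> descLayout(1);
vkCreateDescriptorSetLayout(device, &descriptorLayout, nullptr, descLayout.data());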

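Create the descriptor pool

In Vulkan, descriptor sets cannot be created directly; they are allocated from a descriptor pool. The pool is created from the information in VkDescriptorPoolCreateInfo.

VkDescriptorPool descriptorPool;

// The pool holds 10 descriptors of type VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER.
// A pool size entry is needed for every descriptor type used by the layouts (only one is shown here).
VkDescriptorPoolSize poolSize{};
poolSize.type = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
poolSize.descriptorCount = 10;

VkDescriptorPoolCreateInfo poolInfo{};
poolInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_CREATE_INFO;
poolInfo.poolSizeCount = 1;
poolInfo.pPoolSizes = &poolSize;

// At most 3 descriptor sets can be allocated from this pool
poolInfo.maxSets = 3;

vkCreateDescriptorPool(device, &poolInfo, nullptr, &descriptorPool);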

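Allocate the descriptor set

Descriptor sets are allocated from the descriptor pool as follows:

VkDescriptorSetAllocateInfo allocInfo{};
allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO;
allocInfo.descriptorPool = descriptorPool;
allocInfo.descriptorSetCount = 1;
allocInfo.pSetLayouts = descLayout.data();

// descriptorSets: std::vector<VkDescriptorSet> with descriptorSetCount elements
vkAllocateDescriptorSets(device, &allocInfo, descriptorSets.data());

After that, once they have been pointed at actual resources with vkUpdateDescriptorSets, the descriptor sets are ready to use.

Summary

The whole process of creating descriptor sets is summarized below:
Figure: How descriptor sets are produced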

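Create the pipeline

A pipeline packages the series of operations of the rendering process, from reading vertex and texture data to finally writing the rendering result into the swapchain. Setting the relevant state for each stage lets the rendering process run through in order.

The Vulkan spec divides the pipeline into the following stages:

Figure: The pipeline in the Vulkan spec

The stages of the pipeline fall into two types:

fixed function stage: not programmable and largely implemented in hardware, but configurable through parameters;
shader stage: programmable, that is, the rendering behavior is changed through shaders;

The core work of pipeline creation is to set suitable state for these stages so that subsequent commands can execute correctly.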

Build the shader modules

Shaders used in the pipeline are generally in SPIR-V format, which can be compiled from GLSL. To be used in Vulkan they have to be wrapped in a VkShaderModule:

VkShaderModuleCreateInfo createInfo{};
createInfo.codeSize = ...
// Pointer to the SPIR-V shader code
createInfo.pCode = ...
VkShaderModule shaderModule;
vkCreateShaderModule(device, &createInfo, nullptr, &shaderModule);

VkPipelineShaderStageCreateInfo shaderStage{};
shaderStage.stage = VK_SHADER_STAGE_FRAGMENT_BIT;
shaderStage.module = shaderModule;
shaderStage.pName = "main";

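Set the state of the fixed stages

Different stages of the pipeline need different state, all of which is specified in VkGraphicsPipelineCreateInfo.

The figure below comes from here and shows the various kinds of state involved in creating a pipeline.

The state falls into two categories:

Fixed state: cannot be changed once set;
dynamic state: can be changed at runtime without recreating the pipeline, which matters because pipeline creation is an expensive task; viewport and scissors belong to this category;

Figure: State that has to be set when creating a pipeline

Pseudocode for setting up the various states:

// How vertex data is read and interpreted
VkPipelineVertexInputStateCreateInfo vertexInputInfo{};
// How vertices are assembled into primitives
VkPipelineInputAssemblyStateCreateInfo inputAssembly{};
// Controls the viewport transformation
VkPipelineViewportStateCreateInfo viewportState{};
// How rasterization is performed
VkPipelineRasterizationStateCreateInfo rasterizer{};
// How the depth/stencil tests are performed
VkPipelineDepthStencilStateCreateInfo depthStencilState{};
// Controls multisampling (anti-aliasing)
VkPipelineMultisampleStateCreateInfo multisampling{};
// How this rendering result is combined with the data already present
VkPipelineColorBlendStateCreateInfo colorBlend{};
// Which states are dynamic
VkPipelineDynamicStateCreateInfo dynamicState{};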

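Create the pipeline layout

What is a pipeline layout?

Access to descriptor sets from a pipeline is accomplished through a pipeline layout.

Put simply, the pipeline contains programmable shaders, so how do these shaders get at their input and output data at runtime? A shader has to access resources, and that access goes through descriptor sets. The shader code itself is bound to the pipeline as state, so for shaders in the pipeline to access memory resources, the descriptor sets have to be connected to the pipeline, and the pipeline layout is that connection.

Once the descriptor layout from the previous section has been set up, it can be bound to the pipeline through the pipeline layout:

VkPipelineLayoutCreateInfo pipelineLayoutCI{};
pipelineLayoutCI.sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO;
pipelineLayoutCI.setLayoutCount = 1;
pipelineLayoutCI.pSetLayouts = descLayout.data();
vkCreatePipelineLayout(device, &pipelineLayoutCI, nullptr, &pipelineLayout);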

Finish creating the pipeline

With all the state set up, it has to be referenced from the pipeline create info, as shown below:

VkGraphicsPipelineCreateInfo pipelineInfo{};
// shaderStages: array of VkPipelineShaderStageCreateInfo (vertex and fragment)
pipelineInfo.stageCount = 2;
pipelineInfo.pStages = shaderStages;
pipelineInfo.pVertexInputState = &vertexInputInfo;
pipelineInfo.pInputAssemblyState = &inputAssembly;
pipelineInfo.pViewportState = &viewportState;
pipelineInfo.pRasterizationState = &rasterizer;
pipelineInfo.pMultisampleState = &multisampling;
pipelineInfo.pDepthStencilState = &depthStencilState;
pipelineInfo.pColorBlendState = &colorBlend;
pipelineInfo.pDynamicState = &dynamicState;

To finish creating the pipeline, the pipeline cache, renderpass, and pipeline layout also have to be specified:

VkPipelineCache: pipeline creation is an expensive task, and usually little state changes from one pipeline to the next, so creating each one entirely from scratch is wasteful. A pipeline cache lets the results of earlier pipeline creation be reused, which improves efficiency;
```
VkPipelineCache pipelineCache;
VkPipelineCacheCreateInfo pipelineCacheInfo{};
pipelineCacheInfo.sType = VK_STRUCTURE_TYPE_PIPELINE_CACHE_CREATE_INFO;
vkCreatePipelineCache(device, &pipelineCacheInfo, nullptr, &pipelineCache);
```
Next, specify the renderpass, which states in which renderpass the pipeline is used; in other words, the pipeline is tied to a renderpass.
```
pipelineInfo.renderPass = renderPass;
```
Specify the pipeline layout;
```
pipelineInfo.layout = pipelineLayout;
```

After that, the pipeline object can be created:

vkCreateGraphicsPipelines(device, pipelineCache, 1, &pipelineInfo, nullptr, &pipeline);

Summary

The process of creating a pipeline can be summarized as follows:

Figure: The pipeline creation process

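Prepare the command buffer

So far we have finished the preparation: initialization, setting up where the rendering result is finally displayed, and setting up the pipeline that rendering passes through from start to finish. What remains is to issue a series of commands for a queue of the device to execute, with the result finally shown on screen.

Every command submitted to the GPU has to be recorded in a command buffer. Submitting commands to the GPU in batches this way makes command processing more efficient. It also allows multithreading to be exploited: several command buffers can be recorded in parallel on different threads and then submitted to the GPU.

Initializing a command buffer takes two steps:

Create the command pool
Allocate the command buffer

Create the command pool

Command pools manage the memory that is used to store the buffers and command buffers are allocated from them.

Similar to the relationship between the descriptor pool and descriptor sets, command buffers are allocated from a command pool, so the command pool has to be created first. A command pool is tied to a specific queue family:

VkCommandPoolCreateInfo poolInfo{};
poolInfo.flags = VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT; // allows vkResetCommandBuffer on individual buffers
poolInfo.queueFamilyIndex = graphicsFamilyIndex; // index of the graphics queue family found earlier
vkCreateCommandPool(device, &poolInfo, nullptr, &commandPool);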

Allocate the command buffer

Command buffers are always allocated from a command pool:

VkCommandBufferAllocateInfo allocInfo{};
allocInfo.commandPool = commandPool;
allocInfo.level = VK_COMMAND_BUFFER_LEVEL_PRIMARY;
allocInfo.commandBufferCount = 1;
vkAllocateCommandBuffers(device, &allocInfo, &commandBuffer);

A command buffer allocated from a command pool can only be submitted to queues of the queue family the pool was created for.

Summary

The whole process:
Figure: How a command buffer is created

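Prepare the vertex data

To render, vertex data has to be supplied to the Vulkan API so that primitives can be assembled. Preparing vertex data involves creating and managing resources in Vulkan.

At the beginning, Vulkan tutorials writes the vertex data and the corresponding colors directly in the shader, but normally this data is supplied by the program. To do that, the data has to be copied from the CPU to the GPU before the GPU can use it. Concretely, the process involves:

Creating host memory, that is, memory on the CPU side;
Creating device memory, that is, memory on the GPU side

Vulkan distinguishes these two kinds of memory. For rendering, reading directly from device memory is faster, but our program runs on the CPU and cannot access device memory directly, so:

A connection has to be established between host memory and device memory

Create host memory

"Creating host memory" here simply means loading the vertex data into memory:

struct Vertex {
    glm::vec2 pos;
    glm::vec3 color;
};

const std::vector<Vertex> vertices = {
{{-0.5f, -0.5f}, {1.0f, 0.0f, 0.0f}},
{{0.5f, -0.5f}, {0.0f, 1.0f, 0.0f}},
{{0.5f, 0.5f}, {0.0f, 0.0f, 1.0f}},
{{-0.5f, 0.5f}, {1.0f, 1.0f, 1.0f}}
};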

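Create device memory

Vulkan has two kinds of resources:

buffer: plain linear memory, used here to store and transfer the vertex data;
image: memory that additionally has a specific format and metadata;

A buffer is created as follows:

VkBuffer buffer;
VkBufferCreateInfo bufferInfo{};
bufferInfo.sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO;
bufferInfo.size = size;
bufferInfo.usage = VK_BUFFER_USAGE_TRANSFER_SRC_BIT;
vkCreateBuffer(device, &bufferInfo, nullptr, &buffer);

Here,

usage indicates that the data in this buffer can later be copied elsewhere.

At this point, even though the buffer has been created successfully, no device memory is associated with it, so the GPU cannot yet use the vertex data. Device memory still has to be allocated:

VkDeviceMemory bufferMemory;

VkMemoryRequirements memRequirements;
vkGetBufferMemoryRequirements(device, buffer, &memRequirements);

VkMemoryAllocateInfo allocInfo{};
allocInfo.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
allocInfo.allocationSize = memRequirements.size;

VkPhysicalDeviceMemoryProperties memProperties;
vkGetPhysicalDeviceMemoryProperties(physicalDevice, &memProperties);
// Pick the first memory type that the buffer allows and that has the wanted properties
VkMemoryPropertyFlags wanted = VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT;
for (uint32_t i = 0; i < memProperties.memoryTypeCount; ++i) {
    if ((memRequirements.memoryTypeBits & (1u << i)) &&
        (memProperties.memoryTypes[i].propertyFlags & wanted) == wanted) {
        allocInfo.memoryTypeIndex = i;
        break;
    }
}

vkAllocateMemory(device, &allocInfo, nullptr, &bufferMemory);
vkBindBufferMemory(device, buffer, bufferMemory, 0);

This is the standard pattern for creating memory:

Query the memory requirements of the buffer;
Choose the memory type
Allocate the memory
Bind the memory to the resource object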
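The "choose the memory type" step is the least obvious of the four, so here it is as a standalone sketch. It assumes the property flags of each memory type have been copied into a plain vector; the constants mirror the numeric values of the corresponding VK_MEMORY_PROPERTY_* flags, and findMemoryType is an illustrative helper name, not a Vulkan function.

```cpp
#include <cstdint>
#include <vector>

// Same numeric values as the corresponding VK_MEMORY_PROPERTY_* flags.
constexpr std::uint32_t kDeviceLocal  = 0x1;
constexpr std::uint32_t kHostVisible  = 0x2;
constexpr std::uint32_t kHostCoherent = 0x4;

// typeBits:  bit i is set when memory type i is acceptable for the resource
//            (this plays the role of VkMemoryRequirements::memoryTypeBits).
// wanted:    property flags the chosen type must have, all of them.
// typeFlags: typeFlags[i] stands in for memoryTypes[i].propertyFlags.
// Returns the index of the first suitable memory type, or -1 if none fits.
int findMemoryType(std::uint32_t typeBits, std::uint32_t wanted,
                   const std::vector<std::uint32_t>& typeFlags) {
    for (std::uint32_t i = 0; i < typeFlags.size() && i < 32; ++i) {
        const bool allowed  = (typeBits & (1u << i)) != 0;
        const bool hasProps = (typeFlags[i] & wanted) == wanted;
        if (allowed && hasProps) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

Both conditions matter: a type with the right properties is useless if the resource does not allow it, and an allowed type without host visibility cannot be mapped.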

memory mapping from device to host

Device memory cannot be accessed by the host directly; it has to be mapped into the host's address space. The memory then has one physical address and two virtual addresses, one on the device side and one on the host side.

Note that device memory is not host accessible by default. For mapping to be possible, the memory must be allocated from a memory type that has VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT.

void* data;

vkMapMemory(device, bufferMemory, 0, bufferSize, 0, &data);
memcpy(data, vertices.data(), (size_t) bufferSize);
vkUnmapMemory(device, bufferMemory);

After vkMapMemory completes the mapping, the host has a pointer to the device memory (data here) and can manipulate the data at that location. Once the data has been copied from CPU memory to GPU memory, the mapping can be released, after which data can no longer be used to read or write the GPU memory.
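The copy in the middle of that sequence is ordinary host code, and the byte count is simply the vertex size times the vertex count. Below is a minimal sketch under stated assumptions: Vertex uses plain float arrays in place of glm::vec2 and glm::vec3, the `mapped` pointer stands in for the pointer returned by vkMapMemory, and uploadVertices is a hypothetical helper.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Stand-in for the article's Vertex, with plain floats instead of glm types.
struct Vertex {
    float pos[2];
    float color[3];
};

// Copies the vertices into `mapped` and returns the number of bytes written.
// `mapped` must point to at least sizeof(Vertex) * vertices.size() bytes.
std::size_t uploadVertices(const std::vector<Vertex>& vertices, void* mapped) {
    const std::size_t byteSize = sizeof(Vertex) * vertices.size();
    std::memcpy(mapped, vertices.data(), byteSize);
    return byteSize;
}
```

The same byte count is what would be passed as the buffer size when creating the buffer and as the range when mapping it.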

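Create device-local memory

So far the vertex data has been transferred from CPU memory to GPU memory, but this memory can be read and written by both CPU and GPU, which is not the optimal way for the GPU to use vertex data. The fastest memory has the property VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT, so the data has to be moved once more, this time on the GPU.

Because the copy runs on the GPU, its commands have to be recorded in a command buffer. This is a good chance to get familiar with command buffer recording before issuing actual draw calls:

// memory: VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT
// usage=VK_BUFFER_USAGE_TRANSFER_SRC_BIT
VkBuffer stagingBuffer;
// memory: VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
// usage=VK_BUFFER_USAGE_TRANSFER_DST_BIT | VK_BUFFER_USAGE_VERTEX_BUFFER_BIT
VkBuffer vertexBuffer;

VkCommandBuffer commandBuffer; // allocated from the command pool
VkCommandBufferBeginInfo beginInfo{};
vkBeginCommandBuffer(commandBuffer, &beginInfo);
    VkBufferCopy copyRegion{};
    copyRegion.srcOffset = 0;
    copyRegion.dstOffset = 0;
    copyRegion.size = size;
    vkCmdCopyBuffer(commandBuffer, stagingBuffer, vertexBuffer, 1, &copyRegion);
vkEndCommandBuffer(commandBuffer);

VkSubmitInfo submitInfo{};
submitInfo.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO;
submitInfo.commandBufferCount = 1;
submitInfo.pCommandBuffers = &commandBuffer;
vkQueueSubmit(graphicsQueue, 1, &submitInfo, VK_NULL_HANDLE);
// Wait for the copy to finish before the staging buffer is reused or freed
vkQueueWaitIdle(graphicsQueue);

Finally the copy command is submitted to the queue for the GPU to execute.

Summary

Preparing vertex data means moving the data from the CPU to the GPU, and then within the GPU, so that the GPU can use the vertex data in the optimal way. It involves resource creation and memory management, one of the more central parts of Vulkan.

Figure: Transferring vertex data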

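Drawing

All the setup and preparation is now complete, and the actual drawing can begin, that is, the CPU issues drawing commands to the GPU. Rendering in Vulkan follows a fixed pattern:

Acquire an available image from the swapchain;
Record the command buffers that hold the commands to be submitted to the GPU;
Prepare the submission information and submit;

Acquire an image from the swapchain

Because we want the rendering result to go directly to the screen, we need an available swapchain image as the render target. In general, a rendering pass does not have to output to the screen; its result may serve as an intermediate result for later rendering operations. The task in this article is simple, though, so it outputs directly to the screen.

vkAcquireNextImageKHR does this. The imageIndex it returns is an index into the array of images returned by vkGetSwapchainImagesKHR. A semaphore and/or a fence can be signaled when the image becomes available.

vkAcquireNextImageKHR(device, swapChain, UINT64_MAX, imageAvailableSemaphore, VK_NULL_HANDLE, &imageIndex);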

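Prepare the command buffer

As described earlier, command buffers are allocated from a command pool, and every command submitted to the GPU must be recorded in a command buffer. It is worth noting that there are two kinds of command buffer:

primary command buffer;
secondary command buffer;

Only a primary command buffer can be submitted to a queue; a secondary command buffer can only be executed through a primary command buffer.

A command buffer has a lifecycle, as shown below:

Figure: The lifecycle of a command buffer

After allocation, the command buffer is in the initial state;
Beginning the command buffer moves it from initial to recording;
Ending the command buffer moves it from recording to executable, the only state in which it can be submitted to a queue for execution;
Submitting it moves it from executable to pending, meaning the submitted work is being executed;

Issuing drawing commands really just means placing the corresponding commands between begin and end, while the command buffer is in the recording state. The procedure is:

// Reset returns the command buffer to the initial state (it must not be pending)
vkResetCommandBuffer(commandBuffer, 0);
VkCommandBufferBeginInfo beginInfo{};
vkBeginCommandBuffer(commandBuffer, &beginInfo);
    VkRenderPassBeginInfo rpInfo{};
    vkCmdBeginRenderPass(commandBuffer, &rpInfo, ...);
        vkCmdBindPipeline(commandBuffer, ...);
        vkCmdBindDescriptorSets(commandBuffer, ...);
        vkCmdBindVertexBuffers(commandBuffer, ...);
        vkCmdSetViewport(commandBuffer, ...);
        vkCmdSetScissor(commandBuffer, ...);
        vkCmdDraw(commandBuffer, ...);
    vkCmdEndRenderPass(commandBuffer);
vkEndCommandBuffer(commandBuffer);

An NVIDIA presentation illustrates this process:

Figure: The command buffer recording process

The work falls into several categories:

Managing the command buffer itself: reset/begin/end;
Recording the beginning and end of the renderpass, because here all drawing commands are recorded inside a renderpass;
Binding the pipeline, which provides the state of the various stages;
Setting dynamic states (these supply the values for state the pipeline declared as dynamic);
Binding descriptor sets so that shaders can read memory resources;
Binding the vertex buffer, which determines the input data;
Issuing drawing commands such as vkCmdDraw;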
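Returning to the lifecycle described earlier in this section, the transitions can be modelled as a small state machine. This is a simplified sketch: the enum and function names are hypothetical, the invalid state and one-time-submit behavior are left out, and the return to executable once execution completes is added here because the list above stops at pending.

```cpp
// Simplified model of the command buffer lifecycle (invalid state omitted).
enum class CmdState { Initial, Recording, Executable, Pending };
enum class CmdEvent { Begin, End, Submit, Complete, Reset };

// Returns the state after `e`; an event that is not valid in state `s`
// leaves the state unchanged in this model.
CmdState nextState(CmdState s, CmdEvent e) {
    switch (e) {
        case CmdEvent::Begin:    return s == CmdState::Initial    ? CmdState::Recording  : s;
        case CmdEvent::End:      return s == CmdState::Recording  ? CmdState::Executable : s;
        case CmdEvent::Submit:   return s == CmdState::Executable ? CmdState::Pending    : s;
        case CmdEvent::Complete: return s == CmdState::Pending    ? CmdState::Executable : s;
        case CmdEvent::Reset:    return s != CmdState::Pending    ? CmdState::Initial    : s;
    }
    return s;
}
```

The Reset rule explains why a render loop waits on a fence before calling vkResetCommandBuffer: the command buffer must have left the pending state first.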

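Prepare the submission

Once all the commands have been recorded, the work has to be submitted to a queue for execution. This is actually very simple:

VkSubmitInfo submitInfo{};
vkQueueSubmit(graphicsQueue, 1, &submitInfo, ...);

The interesting question is what submitInfo contains, that is, what the GPU needs to know to carry out the rendering work.

Figure: The submit information of a queue

Here we see that a single vkQueueSubmit call does not submit just one command buffer but a batch: it takes an array of VkSubmitInfo, each of which contains a number of command buffers.

Besides the command buffers, the other key information is the synchronization primitives, such as semaphores and fences, which coordinate host and device, and command buffers with one another, at different granularities. This is one of the more important and difficult parts of Vulkan.

Present to the screen

Earlier we acquired an available image from the swapchain. Once all the commands have finished executing, the rendering result is presented to the screen:

VkPresentInfoKHR presentInfo{};
vkQueuePresentKHR(presentQueue, &presentInfo);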

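Overall summary

The above walks through all the "simple" steps of drawing a triangle with Vulkan, which are in fact rather complicated. Newcomers are likely to be confused by the many concepts, as I was, so Vulkan is not a very friendly entry point into graphics programming. Many of its concepts are borrowed from OpenGL, which compared with Vulkan probably has more Chinese-language material and a more active community. On the other hand, it is precisely this "tedious" design that lets Vulkan squeeze more performance out of the hardware through fine-grained control, and with Moore's law on the verge of breaking down, this may well be the main direction going forward.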

alex Li

https://limeya.github.io/2023/02/12/ji-suan-ji-tu-xing-xue/vulkan/shi-yong-vulkan-hui-zhi-yi-ge-san-jiao-xing/

Unless otherwise stated, all articles on this blog are licensed under CC BY 4.0. Please credit alex Li when reposting!

Tags: Vulkan, triangle, device, queue, swapchain
