Full Papers

Geometry and Modeling

Memory-Efficient Bijective Parameterizations of Very-Large-Scale Models

Chunyang Ye, Jian-Ping Su, Ligang Liu, Xiao-Ming Fu

As high-precision 3D scanners become increasingly widespread, it is easy to obtain very-large-scale meshes that contain at least millions of vertices. However, processing such very-large-scale meshes is very challenging due to memory limitations. This paper focuses on one fundamental geometric processing task, i.e., bijective parameterization construction. To this end, we present a spline-enhanced method to compute bijective and low-distortion parameterizations for very-large-scale disk topology meshes. Instead of computing descent directions using the mesh vertices as variables, we estimate descent directions for each vertex by optimizing a proxy energy defined in spline spaces. Since the spline functions are determined by a small set of control points, this significantly decreases the memory requirement. Besides, a divide-and-conquer method is proposed to obtain bijective initializations, and a submesh-based optimization strategy is developed to further reduce distortion. The capability and feasibility of our method are demonstrated on various complex models. Compared to the existing methods for bijective parameterization of very-large-scale meshes, our method exhibits better scalability and requires much less memory.

Practical Fabrication of Discrete Chebyshev Nets

Hao-Yu Liu, Zhongyuan Liu, Zheng-Yu Zhao, Ligang Liu, Xiao-Ming Fu

We propose a computational and practical technique to allow home users to fabricate discrete Chebyshev nets for various 3D models. The success of our method relies on two key components. The first one is a novel and simple method to approximate discrete integrable, unit-length, and angle-bounded frame fields, which are used for modeling discrete Chebyshev nets. Central to our field generation process is an alternating algorithm that takes turns executing one pass to enforce integrability and another pass to approach unit length while bounding angles. The second is a practical fabrication specification. The discrete Chebyshev net is first partitioned into a set of patches to facilitate manufacturing. Then, each patch is assigned a specification on how to pull, bend, and fold to fit the nets. We demonstrate the capability and feasibility of our method in various complex models.

A Deep Residual Network for Restoration of High-quality Geometric Details

Zhongping Ji, Chengqin Zhou, Qiankan Zhang, Yuwei Zhang, Wenping Wang

In this paper, we formulate the restoration of geometric details as a constrained optimization problem from a geometric perspective. Instead of directly solving this optimization problem, we propose a data-driven method to learn a residual mapping function. We design a geometric detail restoration network (GDRNet) to effectively eliminate the rounding errors. As the key to addressing the problem, we adopt a ResNet-based network structure and a normal-based loss function. Extensive experimental results demonstrate that accurate reconstructions can be achieved effectively using our algorithm. Our method can serve as a compressed representation of reliefs and an important step toward high-quality displacement mapping.

Robust Computation of 3D Apollonius Diagrams

Peihui Wang, Na Yuan, Yuewen Ma, Shiqing Xin, Ying He, Shuangmin Chen, Jian Xu, Wenping Wang

Apollonius diagrams, also known as additively weighted Voronoi diagrams, are an extension of Voronoi diagrams, where the weighted distance is defined as the Euclidean distance minus the weight. The bisectors of Apollonius diagrams have a hyperbolic form, which is fundamentally different from traditional Voronoi diagrams and power diagrams. Though robust solvers are available for computing a 2D Apollonius diagram, there is no solver for the 3D version. In this paper, we systematically analyze the structural features of a 3D Apollonius diagram, and then develop a fast algorithm. Our algorithm consists of three phases, i.e., vertex location, edge tracing and face extraction, among which the key step is to adaptively subdivide the initial large box into a set of sufficiently small boxes such that each box contains at most one Apollonius vertex. Finally, we use the tool of 2D centroidal Voronoi tessellation (CVT) to yield well-tessellated triangle meshes of the curved bisectors. Extensive experimental results validate the effectiveness and robustness of our algorithm. We also present an interesting application to centroidal Apollonius diagrams.
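As a small illustration of the metric involved (not the authors' code), the additively weighted distance that defines an Apollonius diagram can be evaluated as follows; the sites, weights and query point are made-up toy data:

```python
import numpy as np

def apollonius_distance(query, centers, weights):
    """Additively weighted distance: Euclidean distance to each site minus its weight."""
    return np.linalg.norm(centers - query, axis=1) - weights

# Toy example: the nearest site under the weighted metric need not be the
# nearest site under the plain Euclidean metric.
centers = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
weights = np.array([0.1, 1.5])
q = np.array([0.9, 0.0, 0.0])
d = apollonius_distance(q, centers, weights)
print(d, d.argmin())  # site 1 wins despite being farther in Euclidean terms
```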

Image-Driven Furniture Style for Interactive 3D Scene Modeling

Tomer Weiss, Ilkay Yildiz, Nitin Agarwal, Esra Ataer-Cansizoglu, Jae-Woo Choi

Creating realistic styled spaces is a complex task, which involves design know-how for what furniture pieces go well together. Interior style follows abstract rules involving color, geometry and other visual elements. Following such rules, users manually select similar-style items from large repositories of 3D furniture models, a process which is both laborious and time-consuming. We propose a method for fast-tracking style-similarity tasks by learning furniture style-compatibility from interior scene images. Such images contain more style information than images depicting a single piece of furniture. To understand style, we train a deep learning network on a classification task. Based on image embeddings that we extract from our network, we measure the stylistic compatibility of a furniture piece with other furniture. We demonstrate our method with several 3D model style-compatibility results, and with an interactive system for modeling style-consistent scenes.

Physics-based Material Animation

Adjustable Constrained Soft-Tissue Dynamics

Bohan Wang, Mianlun Zheng, Jernej Barbic

Physically based simulation is often combined with geometric mesh animation to add realistic soft-body dynamics to virtual characters. This is commonly done using constraint-based simulation, whereby a soft-tissue simulation is constrained to the geometric animation of a subpart (or other proxy representation) of the character. We observe that standard constraint-based simulation suffers from an important flaw that limits the expressiveness of soft-body dynamics. Namely, under correct physics, the frequency and amplitude of soft-tissue dynamics arising from constraints (“inertial amplitude”) are coupled, and cannot be adjusted independently merely by adjusting the material properties of the model. This means that the space of physically based simulations is inherently limited and cannot capture all effects typically expected by computer animators. For example, animators need the ability to adjust the frequency, inertial amplitude, gravity sag and damping properties of the virtual character, independently from each other, as these are the primary visual characteristics of the soft-tissue dynamics. We demonstrate that independence can be achieved by transforming the equations of motion into a non-inertial reference coordinate frame, then scaling the resulting inertial forces, and then converting the equations of motion back to the inertial frame. Such scaling of inertia makes it possible for the animator to set the character’s inertial amplitude independently from frequency. We also provide exact controls for the amount of the character’s gravity sag and the damping properties. In our examples, we use linear blend skinning and pose-space deformation for geometric mesh animation, and the Finite Element Method for soft-body constrained simulation, but our idea of scaling inertial forces is general and applicable to other animation and simulation methods. We demonstrate our technique on several character examples.

Learning Elastic Constitutive Material and Damping Models

Bin Wang, Paul Kry, Yuanmin Deng, Baoquan Chen, Uri Ascher, Hui Huang


Fracture Patterns Design for Anisotropic Models with the Material Point Method

Wei Cao, Luan Lyu, Xiaohua Ren,  Bob Zhang,  Zhixin Yang, Enhua Wu

Physically plausible fracture animation is a challenging topic in computer graphics. Most of the existing approaches focus on the fracture of isotropic materials. We propose a frame-field method for the design of anisotropic brittle fracture patterns. In this case, the material anisotropy is determined by two parts: anisotropic elastic deformation and anisotropic damage mechanics. For the elastic deformation, we reformulate the constitutive model of hyperelastic materials to achieve anisotropy by adding additional energy density functions in particular directions. For the damage evolution, we propose an improved phase-field fracture method to simulate the anisotropy by designing a second-order structural tensor. These two parts can present elastic anisotropy and fractured anisotropy respectively, or they can be well coupled together to exhibit rich crack effects. To ensure the flexibility of simulation, we further introduce a frame-field concept to assist in setting local anisotropy, similar to the fiber orientation of textiles. For the discretization of the deformable object, we adopt a novel Material Point Method (MPM) according to its fracture-friendly nature. We also give some design criteria for anisotropic models through comparative analysis. Experiments show that our anisotropic method can be well integrated with the MPM scheme for simulating the dynamic fracture behavior of anisotropic materials.

A Novel Plastic Phase-Field Method for Ductile Fracture with GPU Optimization

Zipeng Zhao, Kemeng Huang, Chen Li, Changbo Wang, Hong Qin

In this paper, we articulate a novel plastic phase-field (PPF) method that can tightly couple the phase-field with plastic treatment to efficiently simulate ductile fracture with GPU optimization. At the theoretical level of physically-based modeling and simulation, our PPF approach assumes the fracture sensitivity of the material increases with the plastic strain accumulation. As a result, we first develop a hardening-related fracture toughness function towards phase-field evolution. Second, we follow the associative flow rule and adopt a novel degraded von Mises yield criterion. In this way, we establish the tight coupling of the phase-field and plastic treatment, with which our PPF method can present distinct elastoplasticity, necking, and fracture characteristics during ductile fracture simulation. At the numerical level towards GPU optimization, we further devise an advanced parallel framework that takes full advantage of the hierarchical architecture. Our strategy dramatically enhances the computational efficiency of preprocessing and phase-field evolution for our PPF with the material point method (MPM). Based on our extensive experiments on a variety of benchmarks, our novel method’s performance gain can reach a 1.56x speedup over the primary GPU MPM. Finally, our comprehensive simulation results have confirmed that this new PPF method can efficiently and realistically simulate complex ductile fracture phenomena in 3D interactive graphics and animation.

Physics and Graphics

Simulation of Arbitrarily-shaped Magnetic Objects

Seung-wook Kim, JungHyun Han

We propose a novel method for simulating rigid magnets in a stable way. It is based on analytic solutions of the magnetic vector potential and flux density, which ensure that the magnetic forces and torques calculated using them hardly diverge. Therefore, our magnet simulations remain robust even when magnets are in close proximity or penetrate each other. Thanks to this robustness, our method can simulate magnets of any shape represented as 3D polygon meshes. Another strength of our method is that the time complexities for computing the magnetic forces and torques are significantly reduced compared to previous methods. Our method is easily integrated with classic rigid-body simulators. The experimental results presented in this paper prove the robustness and efficiency of our method.

Semi-analytical Solid Boundary Conditions for Free Surface Flows

Yue Chang, Shusen Liu, Xiaowei He, Sheng Li, Guoping Wang

The treatment of solid boundary conditions remains one of the most challenging parts of the SPH method. We present a semi-analytical approach to handle complex solid boundaries of arbitrary shape. Instead of calculating a renormalizing factor for particles near the boundary, we propose to calculate the volume integral inside the solid boundary under the local spherical frame of a particle. By converting the volume integral into a surface integral, a computer-aided design (CAD) mesh file representing the boundary can be naturally integrated into particle simulations. To accelerate the search for a particle’s neighboring triangles, a uniform grid is applied to store the indexes of intersecting triangles. The new semi-analytical solid boundary handling approach is integrated into a position-based method [MM13] as well as a projection-based method [HWW20] to demonstrate its effectiveness in handling complex boundaries. Experiments show that our method is able to achieve comparable results with those simulated using ghost particles. In addition, our method shows better performance, especially for large-scale boundaries, and is flexible enough to handle complex solid boundaries, including sharp corners and shells.

Cosserat Rod with rh-adaptive Discretization

Jiahao Wen, Chen Jiong, Nobuyuki Umetani, Hujun Bao, Jin Huang

Rod-like one-dimensional elastic objects often exhibit complex behaviors that pose great challenges to discretization methods for pursuing an accurate simulation. By only moving a small part of material points, the Eulerian-on-Lagrangian (EoL) method already shows great adaptivity to handle sharp contact, but it is still far from enough to reproduce the rich and complex geometric details arising in simulation. In this paper, we extend the discrete configuration space by unifying all Lagrangian and EoL nodes via assigning every sample an extra changeable material coordinate for even more adaptivity. However, this extension immediately brings much more redundancy into the dynamical system. Therefore, we further propose an additional energy to control the spatial distribution of all material points, seeking to equally space them with respect to a curvature-based function as a monitor. This flexible approach can effectively constrain the motion of material points to resolve numerical degeneracy, while at the same time allowing them to notably slide inside the parametric domain to account for shape parameterization. In addition, to accurately respond to sharp contact, our method can also insert or remove nodes online and adjust the energy stiffness to suppress possible jittering artifacts that could be excited in a stiff system. As a result of this hybrid $rh$-adaptation, our proposed method is capable of reproducing many realistic rod dynamics, such as excessive bending, twisting and knotting, while only using a limited number of elements.

Rendering

Fast Out-of-Core Octree Generation for Massive Point Clouds

Markus Schütz, Stefan Ohrhallinger, Michael Wimmer

We propose an efficient out-of-core octree generation method for arbitrarily large point clouds. It utilizes a hierarchical counting sort to quickly split the point cloud into small chunks, which are then processed in parallel. Levels of detail are generated by subsampling the full data set from the bottom up using one of multiple exchangeable sampling strategies. We introduce a fast hierarchical approximate blue-noise strategy and compare it to a uniform random sampling strategy. The throughput, including out-of-core access to disk, generating the octree, and writing the final result to disk, is about an order of magnitude faster than the state of the art, and reaches up to around 6 million points per second for the blue-noise approach and up to around 9 million points per second for the uniform random approach on modern SSDs.
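A minimal sketch of the counting-sort chunking idea, assuming an in-memory numpy point cloud and a fixed grid resolution (the actual system works out of core and hierarchically):

```python
import numpy as np

def chunk_points(points, grid_res=32):
    """Bucket a point cloud into per-cell chunks via a counting pass on grid keys."""
    lo, hi = points.min(axis=0), points.max(axis=0)
    # integer cell index of each point on a grid_res^3 lattice
    cells = np.clip(((points - lo) / (hi - lo + 1e-9) * grid_res).astype(np.int64),
                    0, grid_res - 1)
    key = (cells[:, 0] * grid_res + cells[:, 1]) * grid_res + cells[:, 2]
    counts = np.bincount(key, minlength=grid_res ** 3)     # counting pass
    # a stable sort on the key groups points exactly as the scatter pass of a
    # counting sort would; chunk i occupies offsets[i]:offsets[i+1]
    order = np.argsort(key, kind="stable")
    offsets = np.concatenate(([0], np.cumsum(counts)))
    return points[order], offsets

pts = np.random.rand(100_000, 3).astype(np.float32)
sorted_pts, offsets = chunk_points(pts)
```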

Real time multiscale rendering of dense dynamic stackings

Élie Michel, Tamy Boubekeur

Dense dynamic aggregates of similar elements are frequent in natural phenomena and challenging to render under full real-time constraints. The optimal representation to render them changes drastically depending on the distance at which they are observed, ranging from sets of detailed textured meshes for near views to point clouds for distant ones. Our multiscale representation uses impostors to achieve the mid-range transition from mesh-based to point-based scales. To ensure a visual continuum, the impostor model should match the mesh as closely as possible on one side, and reduce to a single-pixel response that equals point rendering on the other. In this paper, we propose a model based on rich spherical impostors, able to combine precomputed as well as dynamic procedural data, and offering seamless transitions from close instanced meshes to distant points. Our approach is structured around an on-the-fly discrimination mechanism and intensively exploits the rough spherical geometry of the impostor proxy. In particular, we propose a new sampling mechanism to reconstruct novel views from the precomputed ones, together with a new conservative occlusion culling method, coupled with a two-pass rendering pipeline leveraging early-Z rejection. As a result, our system scales well and is even able to render sand, while supporting completely dynamic stackings.

Automatic Band-Limited Approximation of Shaders Using Mean-Variance Statistics in Clamped Domain

Shi Li, Rui Wang, Huo Yuchi, Wenting Zheng, Wei Hua, Hujun Bao

In this paper, we present a new shader smoothing method to improve the quality and generality of band-limiting shader programs. Previous work treats intermediate values in the program as random variables and utilizes mean and variance statistics to smooth shader programs. In this work, we extend the band-limiting framework by exploring the observation that an intermediate value in the program is usually computed by a complex composition of functions, where the domain and range of the composited functions heavily impact the statistics of smoothed programs. Accordingly, we propose three new shader smoothing rules for specific compositions of functions by considering the domain and range, which provide better mean and variance statistics for the approximations. Aside from continuous functions, a texture, such as a color texture or normal map, can be treated as a discrete function with limited domain and range, and can thereby be processed similarly in the newly proposed framework. Experiments show that, compared with previous work, our method is capable of generating better smoothness of shader programs as well as handling a broader set of shader programs.
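For intuition only, and not the paper's actual smoothing rules: band-limiting replaces an intermediate value f(x) by the expectation of f over a distribution around x. A brute-force Monte Carlo reference with the input clamped to the function's valid domain might look like this; `bandlimit` and the toy `step` shader are illustrative names:

```python
import numpy as np

def bandlimit(f, mean, var, lo=None, hi=None, n=256, seed=0):
    """Monte Carlo reference for E[f(X)], X ~ N(mean, var), with X clamped
    to the domain [lo, hi] before evaluation."""
    x = np.random.default_rng(seed).normal(mean, np.sqrt(var), size=n)
    if lo is not None or hi is not None:
        x = np.clip(x, lo, hi)
    return f(x).mean()

# Example: smoothing a hard step over a pixel footprint of variance 0.01
step = lambda x: (x > 0.5).astype(float)
print(bandlimit(step, mean=0.52, var=0.01))  # a soft value between 0 and 1
```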

Lights and Ray Tracing

Unsupervised Image Reconstruction for Gradient-Domain Volumetric Rendering

Zilin Xu, Qiang Sun, Lu Wang, Yanning Xu, Beibei Wang

Gradient-domain rendering can greatly improve the convergence of light transport simulation by exploiting smoothness in image space. These methods generate image gradients and solve an image reconstruction problem using the rendered image and the gradient images. Recently, gradient-domain volumetric photon density estimation was proposed for homogeneous participating media. However, its image reconstruction relies on traditional L1 reconstruction, which leads to obvious artifacts when only a few rendering passes are performed. Deep learning based reconstruction methods have been exploited for surface rendering, but they are not suitable for volume density estimation. In this paper, we propose an unsupervised neural network for image reconstruction of gradient-domain volumetric photon density estimation, more specifically for volumetric photon mapping, using a variant of GradNet with an encoded shift connection and a separated auxiliary feature branch, which includes volume-based auxiliary features such as transmittance and photon density. Our network smooths the images at the global scale and preserves high-frequency details at small scales. We demonstrate that our network produces higher-quality results compared to previous work. Although we only considered volumetric photon mapping, it is straightforward to extend our method to other forms, like beam radiance estimation.

Next Event Estimation++: Visibility Mapping for Efficient Light Transport Simulation

Jerry Jinfeng Guo, Martin Eisemann, Elmar Eisemann

Monte Carlo rendering techniques need to estimate the visibility between distinct points in a scene, the most common and compute-intensive operation, in order to establish valid light paths between camera and light source. Unfortunately, many of these tests fail due to occlusion, and the corresponding paths do not contribute to the final image. In this work we present next event estimation++ (NEE++): a visibility mapping technique to perform visibility tests in a more informed way by caching voxel-to-voxel visibility probabilities. We show two scenarios in this regard: Russian-roulette-style rejection sampling of visibility tests and direct importance sampling of the visibility. We show applications to next event estimation and light sampling in a uniform path tracer, and to light sub-path sampling in bidirectional path tracing. The technique is simple to implement, trivial to add to existing rendering systems, and comes at almost no cost, as the required information can be directly extracted from the rendering process itself. It discards up to 80% of visibility tests on average while reducing variance by ~20% compared to other state-of-the-art light sampling techniques with the same number of samples. It gracefully handles even complex scenes with efficiency similar to Metropolis light transport techniques but with a more uniform convergence.
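A rough sketch of the caching idea, not the paper's implementation: a voxel-pair table accumulates the success rate of past shadow rays, and a Russian-roulette test consults it before tracing a new one. The grid resolution, the prior counts and the `trace_shadow_ray` callback are assumptions for illustration:

```python
import numpy as np

class VisibilityCache:
    """Voxel-to-voxel visibility probabilities, learned from past shadow rays."""
    def __init__(self, res=8):
        self.res = res
        n = res ** 3
        self.hits = np.ones((n, n))       # Laplace-style prior: 1 hit ...
        self.tries = 2 * np.ones((n, n))  # ... out of 2 tries per voxel pair

    def voxel(self, p, lo, hi):
        c = np.clip(((p - lo) / (hi - lo) * self.res).astype(int), 0, self.res - 1)
        return (c[0] * self.res + c[1]) * self.res + c[2]

    def probability(self, va, vb):
        return self.hits[va, vb] / self.tries[va, vb]

    def record(self, va, vb, visible):
        self.tries[va, vb] += 1
        self.hits[va, vb] += float(visible)

def sample_light_visibility(shade_pt, light_pt, cache, lo, hi, trace_shadow_ray, rng):
    va, vb = cache.voxel(shade_pt, lo, hi), cache.voxel(light_pt, lo, hi)
    p = cache.probability(va, vb)
    if rng.random() > p:                 # Russian roulette: skip likely-occluded tests
        return 0.0
    visible = trace_shadow_ray(shade_pt, light_pt)
    cache.record(va, vb, visible)
    return (1.0 if visible else 0.0) / p # reweight the surviving tests to stay unbiased
```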

Two-stage Resampling for Bidirectional Path Tracing with Multiple Light Sub-paths

Kosuke Nabata, Kei Iwasaki, Yoshinori Dobashi

Recent advances in bidirectional path tracing (BPT) reveal that the use of multiple light sub-paths and the resampling of a small number of light sub-paths from them can improve the efficiency of BPT. Increasing the number of pre-sampled light sub-paths can better explore the possibility of generating light paths with large contributions and can alleviate the correlation of light paths due to the reuse of the pre-sampled light sub-paths by all eye sub-paths. The increase in pre-sampled light sub-paths, however, also incurs a high computational cost. In this paper, we propose a two-stage resampling method for BPT to efficiently handle a large number of pre-sampled light sub-paths. We also derive the weighting function that can treat the change in the path probability due to the two-stage resampling. Our method can handle a number of pre-sampled light sub-paths two orders of magnitude larger than previous methods in equal-time rendering, resulting in stable and better noise reduction than the state-of-the-art method.
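A generic two-stage resampled-importance-sampling sketch (the paper's weighting function and grouping differ): a group of candidates is first picked proportionally to its summed weight, then one candidate is picked inside that group, and the product of the two stage probabilities gives the overall resampling pdf:

```python
import numpy as np

def two_stage_resample(weights, group_size, rng):
    """Pick one index from `weights` (assumed positive) via two-stage resampling.
    Returns (index, effective_probability)."""
    w = np.asarray(weights, dtype=float)
    groups = w.reshape(-1, group_size)            # stage 1 operates on group sums
    gsum = groups.sum(axis=1)
    p_group = gsum / gsum.sum()
    g = rng.choice(len(gsum), p=p_group)
    p_in_group = groups[g] / gsum[g]              # stage 2 inside the chosen group
    k = rng.choice(group_size, p=p_in_group)
    idx = g * group_size + k
    return idx, p_group[g] * p_in_group[k]

rng = np.random.default_rng(0)
weights = rng.random(1024) + 1e-6                 # e.g. light sub-path contributions
idx, p = two_stage_resample(weights, group_size=32, rng=rng)
# an unbiased estimator of sum_i f_i then uses f[idx] / p
```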

Materials and Shading Models

Computing the Bidirectional Scattering of a Microstructure Using Scalar Diffraction Theory and Path Tracing

Viggo Falster, Adrián Jarabo, Jeppe Revall Frisvad

Most models for bidirectional surface scattering by arbitrary explicitly defined microgeometry are either based on geometric optics and include multiple scattering but no diffraction effects, or based on wave optics and include diffraction but no multiple scattering effects. The few exceptions to this tendency are based on rigorous solution of Maxwell’s equations and are computationally intractable for surface microgeometries that are tens or hundreds of microns wide. We set up a measurement equation for combining results from single-scattering scalar diffraction theory with multiple-scattering geometric optics using Monte Carlo integration. Since we consider an arbitrary surface microgeometry, our method enables us to compute the expected bidirectional scattering of the meta-surfaces with increasingly small details that are seen more and more often in production. In addition, we can take a measured microstructure as input and, for example, compute the difference in bidirectional scattering between a desired surface and a produced surface. In effect, our model can account for both diffraction colors due to wavelength-sized features in the microgeometry and brightening due to multiple scattering.

Procedural Physically-based BRDF for Real-Time Rendering of Glints

Xavier Chermain, Basile Sauvage, Jean-Michel Dischler, Carsten Dachsbacher

Physically based rendering of glittering surfaces is a challenging problem in computer graphics. Several methods have proposed off-line solutions, but none is dedicated to high-performance graphics. In this work, we propose the first physically based BRDF for real-time rendering of glints. Our model can reproduce the appearance of sparkling materials (rocks, rough plastics, glitter fabrics, etc.). Compared to the previous real-time method [Zirr and Kaplanyan 2016], which is not physically based, our BRDF uses normalized NDFs and converges to the standard microfacet BRDF [Cook and Torrance 1982] for a large number of microfacets. Our method procedurally computes NDFs with hundreds of sharp lobes. It relies on a dictionary of 1D marginal distributions: at each location two of them are randomly picked and multiplied (to obtain an NDF), rotated (to increase the variety), and scaled (to control standard deviation/roughness). The dictionary is multiscale, does not depend on roughness, and has a low memory footprint (less than 1 MiB).
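A toy numpy sketch of this construction, in which two 1D marginals are picked, multiplied, rotated and scaled to evaluate a local NDF; the dictionary contents, grid and parameters are placeholders rather than the paper's distributions:

```python
import numpy as np

def make_dictionary(n_dists=8, n_bins=64, seed=0):
    """Dictionary of multi-lobe 1D marginal distributions on slope space [-1, 1]."""
    rng = np.random.default_rng(seed)
    x = np.linspace(-1.0, 1.0, n_bins)
    dists = []
    for _ in range(n_dists):
        centers = rng.uniform(-0.8, 0.8, size=16)            # many sharp lobes
        d = np.exp(-0.5 * ((x[:, None] - centers) / 0.03) ** 2).sum(axis=1)
        dists.append(d / (d.sum() * (x[1] - x[0])))           # normalize each marginal
    return x, dists

def eval_local_ndf(sx, sy, xgrid, marg_u, marg_v, theta, roughness):
    """Product of two 1D marginals, evaluated at rotated and scaled slopes."""
    c, s = np.cos(theta), np.sin(theta)
    u = (c * sx + s * sy) / roughness
    v = (-s * sx + c * sy) / roughness
    iu = np.clip(np.searchsorted(xgrid, u), 0, len(xgrid) - 1)
    iv = np.clip(np.searchsorted(xgrid, v), 0, len(xgrid) - 1)
    return marg_u[iu] * marg_v[iv] / (roughness * roughness)  # keep normalization

xgrid, dic = make_dictionary()
rng = np.random.default_rng(1)
a, b = rng.choice(len(dic), size=2, replace=False)   # pick two marginals per texel
ndf_value = eval_local_ndf(0.05, -0.02, xgrid, dic[a], dic[b], theta=0.7, roughness=0.3)
```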

A Bayesian Inference Framework for Procedural Material Parameter Estimation

Yu Guo, Milos Hasan, Lingqi Yan, Shuang Zhao

Procedural material models have been gaining traction in many applications thanks to their flexibility, compactness, and easy editability. We explore the inverse rendering problem of procedural material parameter estimation from photographs, presenting a unified view of the problem in a Bayesian framework. In addition to computing point estimates of the parameters by optimization, our framework uses a Markov Chain Monte Carlo approach to sample the space of plausible material parameters, providing a collection of plausible matches that a user can choose from, and efficiently handling both discrete and continuous model parameters. To demonstrate the effectiveness of our framework, we fit procedural models of a range of materials—wall plaster, leather, wood, anisotropic brushed metals and layered metallic paints—to both synthetic and real target images.
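A minimal random-walk Metropolis sketch of the sampling component, assuming a black-box forward model `render(theta)`, a Gaussian likelihood and a uniform box prior; it illustrates the MCMC idea only, not the paper's sampler:

```python
import numpy as np

def sample_posterior(render, target, theta0, n_steps=2000, step=0.05,
                     sigma=0.05, prior_lo=0.0, prior_hi=1.0, seed=0):
    """Random-walk Metropolis over procedural parameters theta.
    log posterior = -||render(theta) - target||^2 / (2 sigma^2) within a uniform prior."""
    rng = np.random.default_rng(seed)

    def log_post(theta):
        if np.any(theta < prior_lo) or np.any(theta > prior_hi):
            return -np.inf                       # outside the uniform prior
        r = render(theta) - target
        return -0.5 * np.dot(r, r) / sigma ** 2

    theta = np.array(theta0, dtype=float)
    lp = log_post(theta)
    samples = []
    for _ in range(n_steps):
        prop = theta + rng.normal(0.0, step, size=theta.shape)
        lp_prop = log_post(prop)
        if np.log(rng.random() + 1e-12) < lp_prop - lp:   # Metropolis acceptance
            theta, lp = prop, lp_prop
        samples.append(theta.copy())
    return np.array(samples)

# Toy usage with a stand-in "renderer" (hypothetical forward model)
true_theta = np.array([0.3, 0.7])
render = lambda t: np.concatenate([t, t ** 2])
target = render(true_theta)
posterior = sample_posterior(render, target, theta0=np.array([0.5, 0.5]))
```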

Recognition

SRF-Net: Spatial Relationship Feature Network for Tooth Point Cloud Classification

Qian Ma, Guangshun Wei, Yuanfeng Zhou, Xiao Pan, Shiqing Xin, Wenping Wang

3D scanned point cloud data of teeth is widely used in digital orthodontics. The classification and semantic labelling of the point cloud of each tooth is a key and challenging task for planning dental treatment. Utilizing the prior ordered position information of the tooth arrangement, we propose an effective network for tooth model classification in this paper. The relative position and adjacency similarity feature vectors are calculated for each 3D tooth model and combined with the geometric features in the fully connected layers of the classification training task. For the classification of abnormal teeth, we present a label correction method to improve the classification accuracy. We also use FocalLoss as the loss function to address the sample imbalance of wisdom teeth. The extensive evaluations, ablation studies and comparisons demonstrate that the proposed network can classify tooth models accurately and automatically and outperforms state-of-the-art point cloud classification methods.
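For reference, the focal loss mentioned above down-weights well-classified examples; a standard multi-class form (the `gamma` and `alpha` values here are the usual defaults, not necessarily the paper's) is:

```python
import numpy as np

def focal_loss(logits, labels, gamma=2.0, alpha=0.25):
    """Multi-class focal loss: -alpha * (1 - p_t)^gamma * log(p_t), averaged over samples."""
    z = logits - logits.max(axis=1, keepdims=True)        # numerically stable softmax
    p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    p_t = p[np.arange(len(labels)), labels]                # probability of the true class
    return np.mean(-alpha * (1.0 - p_t) ** gamma * np.log(p_t + 1e-12))

logits = np.array([[2.0, 0.1, -1.0], [0.2, 0.3, 2.5]])
labels = np.array([0, 2])
print(focal_loss(logits, labels))
```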

Semi-Supervised 3D Shape Recognition via Multimodal Deep Co-training

Mofei Song, Yu Liu, Xiaofan Liu

3D shape recognition has been actively investigated in the field of computer graphics. With the rapid development of deep learning, various deep models have been introduced and achieved remarkable results. Most 3D shape recognition methods are supervised and learn only from a large amount of labeled shapes. However, it is expensive and time-consuming to obtain such a large training set. In contrast to these methods, this paper studies a semi-supervised learning framework to train a deep model for 3D shape recognition by using both labeled and unlabeled shapes. Inspired by the co-training algorithm, our method iterates between model training and pseudo-label generation phases. In the model training phase, we train two deep networks based on the point cloud and multi-view representations simultaneously. In the pseudo-label generation phase, we generate the pseudo-labels of the unlabeled shapes using the joint prediction of the two networks, which augments the labeled set for the next iteration. To extract more reliable consensus information from multiple representations, we propose an uncertainty-aware consistency loss function to combine the two networks into a multimodal network. This not only encourages the two networks to give similar predictions on the unlabeled set, but also eliminates the negative influence of the large performance gap between the two networks. Experiments on the benchmark ModelNet40 demonstrate that, with only 10% labeled training data, our approach achieves competitive performance to the results reported by supervised methods.
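A schematic of the pseudo-label generation phase, assuming the two networks output per-class probabilities for each unlabeled shape; the agreement-and-confidence rule below is illustrative, not the paper's exact uncertainty-aware criterion:

```python
import numpy as np

def generate_pseudo_labels(probs_pc, probs_mv, conf_thresh=0.9):
    """Joint prediction of a point-cloud network and a multi-view network.
    Returns (indices, labels) of unlabeled shapes to add to the labeled set."""
    joint = 0.5 * (probs_pc + probs_mv)          # average the two predictions
    labels = joint.argmax(axis=1)
    conf = joint.max(axis=1)
    agree = probs_pc.argmax(axis=1) == probs_mv.argmax(axis=1)
    keep = agree & (conf >= conf_thresh)         # only confident, consistent shapes
    return np.nonzero(keep)[0], labels[keep]

rng = np.random.default_rng(0)
p1 = rng.dirichlet(np.ones(40), size=100)        # e.g. ModelNet40 class probabilities
p2 = rng.dirichlet(np.ones(40), size=100)
idx, pseudo = generate_pseudo_labels(p1, p2)
```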

The Layerizing VoxPoint Annular Convolutional Network for 3D Shape Classification

Tong Wang, Wenyuan Tao, Chung-Ming Own, Xiantuo Lou, Yuehua Zhao

Analyzing the geometric and semantic properties of 3D point cloud data via deep learning networks is still challenging due to the irregularity and sparsity of samplings of their geometric structures. In our study, we combine the advantages of voxels and point clouds by presenting a new data form of voxel models, called Layer-Ring data. This data type can retain the fine description of the 3D data and keep the high efficiency of feature extraction. A network architecture, called the VoxPoint Annular Network (VAN), then works on the Layer-Ring data for feature extraction and object category prediction. The design idea is based on edge extraction and the coordinate representation of each voxel on the separated layer. With this flexible design, our proposed VAN can adapt to the layer’s geometric variability and scalability. Finally, extensive experiments and comparisons demonstrate that our approach obtains results that rival the state-of-the-art methods on a variety of standard benchmark datasets (e.g., ModelNet10, ModelNet40). Moreover, the tests also prove that 3D shape features can be learned efficiently and robustly.

SRNet: A 3D Scene Recognition Network using Static Graph and Dense Semantic Fusion

Zhaoxin Fan, Hongyan Liu, Jun He, Qi Sun, Xiaoyong Du

Point cloud based 3D scene recognition is a fundamental task for real-world applications such as Simultaneous Localization and Mapping (SLAM). However, most existing methods do not take full advantage of the contextual semantic features of scenes. Besides, their recognition abilities are severely affected by dynamic noise, such as points of cars and pedestrians in the scene. To tackle these issues, we propose a new Scene Recognition Network, namely SRNet, where we leverage recent advanced point cloud reasoning methods to learn discriminative features for finding similar scenes. Specifically, to learn local features without dynamic noise, we propose a Static Graph Convolution (SGC) layer, and several SGC layers are then stacked as our backbone. Next, to further avoid dynamic noise, a Spatial Attention Module (SAM) is developed to make the feature descriptor pay more attention to immovable local areas that are more relevant to our task. Finally, in order to make better sense of the scene, we design a Dense Semantic Fusion (DSF) strategy to integrate multi-level features during feature propagation, which helps the model deepen its understanding of the contextual semantics of scenes. With these designs, SRNet can map scenes to discriminative and generalizable feature vectors, which are then used for finding matching pairs, so that recognition can be achieved. Experimental studies demonstrate that SRNet achieves a new state of the art on scene recognition and shows good generalization ability to other point cloud based tasks.

A Graph-based One-Shot Learning Method for Point Cloud Recognition

Zhaoxin Fan, Hongyan Liu, Jun He, Qi Sun, Xiaoyong Du

Point cloud based 3D vision tasks such as 3D object recognition are critical to many real-world applications such as autonomous driving. Many point cloud processing models based on deep learning have been proposed by researchers recently. However, they are all large-sample dependent, which means that a large amount of manually labelled training data is needed to train the model, resulting in huge labor costs. In this paper, to tackle this problem, we propose a One-Shot learning model for Point Cloud Recognition, namely OS-PCR. Different from previous methods, our method formulates a new setting where the model only needs to see one sample per class once for memorizing at inference time when new classes need to be recognized. To fulfill this task, we design three modules in the model: an Encoder Module, an Edge-conditioned Graph Convolution Network Module, and a Query Module. To evaluate the performance of the proposed model, we build a one-shot learning benchmark dataset for 3D point cloud analysis. Then, comprehensive experiments are conducted on it to demonstrate the effectiveness of our proposed model.

Human Pose

Human Pose Transfer by Adaptive Hierarchical Deformation

Jinsong Zhang, Xingzi Liu, Kun Li

Human pose transfer, as a misaligned image generation task, is very challenging. Existing methods cannot effectively utilize the input information and often fail to preserve the style and shape of hair and clothes. In this paper, we propose an adaptive human pose transfer network with two hierarchical deformation levels. The first level generates human semantic parsing aligned with the target pose, and the second level generates the final textured person image in the target pose under this semantic guidance. To avoid the drawback of vanilla convolution, which treats all pixels as valid information, we use gated convolution in both levels to dynamically select the important features and adaptively deform the image layer by layer. Our model has very few parameters and is fast to converge. Experimental results demonstrate that our model achieves better performance, with more consistent hair, face and clothes, using fewer parameters than state-of-the-art methods. Furthermore, our method can be applied to clothing texture transfer.
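A compact PyTorch sketch of a gated convolution layer of the kind referred to above (layer sizes and the activation are placeholders):

```python
import torch
import torch.nn as nn

class GatedConv2d(nn.Module):
    """Gated convolution: a learned soft mask selects which features pass through."""
    def __init__(self, in_ch, out_ch, kernel_size=3, stride=1, padding=1):
        super().__init__()
        self.feature = nn.Conv2d(in_ch, out_ch, kernel_size, stride, padding)
        self.gate = nn.Conv2d(in_ch, out_ch, kernel_size, stride, padding)
        self.act = nn.LeakyReLU(0.2, inplace=True)

    def forward(self, x):
        # per-pixel, per-channel gate in [0, 1] weights the extracted features
        return self.act(self.feature(x)) * torch.sigmoid(self.gate(x))

x = torch.randn(1, 3, 64, 64)          # e.g. an image conditioned on a target pose
y = GatedConv2d(3, 32)(x)              # features weighted by their learned gates
```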

Personalized Hand Modeling from Multiple Postures with Multi-view Color images

Yangang Wang, Ruting Rao, Changqing Zou

Personalized hand models can be utilized to synthesize high-quality hand datasets, provide more possible training data for deep learning, and improve the accuracy of hand pose estimation. In recent years, parameterized hand models, e.g., MANO, have been widely used for obtaining personalized hand models. However, due to the low resolution of existing parameterized hand models, it is still hard to obtain high-fidelity personalized hand models. In this paper, we propose a new method for estimating personalized hand models from multiple hand postures with multi-view color images. The personalized hand model is represented by a personalized neutral hand and hand postures. We propose a novel optimization strategy to estimate the neutral hand from multiple hand postures. To demonstrate the performance of our method, we have built a multi-view system and captured more than 35 people, each with 30 hand postures. We hope our estimated hand models can boost high-fidelity parameterized hand models in the future. All the hand models will be publicly available on our web page.

Monocular Human Pose and Shape Reconstruction with Part Differentiable Rendering

Min Wang, Feng Qiu, Wentao Liu, Chen Qian, Xiaowei Zhou, Lizhuang Ma

Superior human pose and shape reconstruction from monocular images depends on removing the ambiguities caused by occlusions and shape variance. Recent works succeed with regression-based methods, which estimate parametric models directly through a deep neural network supervised by 3D ground truth. However, 3D ground truth is neither abundant nor effortless to obtain. In this paper, we introduce body part segmentation as critical supervision. Part segmentation not only indicates the shape of each body part but also helps to infer the occlusions between parts. To improve the reconstruction with part segmentation, we propose a part-level differentiable renderer that enables part-based models to be supervised by part segmentation in neural networks or optimization loops. We also introduce a general parametric model engaged in the rendering pipeline as an intermediate representation between skeletons and detailed shapes, which consists of primitive geometries for better interpretability. The proposed approach combines parameter regression, body model optimization, and detailed model registration altogether. Experimental results demonstrate that the proposed method achieves balanced evaluation on pose and shape, and outperforms the state-of-the-art approaches on the Human3.6M and LSP datasets.

PointSkelCNN: Deep Learning-based 3D Human Skeleton Extraction from Point Clouds

Hongxing Qin, Songshan Zhang, Baoquan Chen

3D human skeletons play an important role in human shape reconstruction, human animation, and so on. Recently, remarkable advances have been achieved in 3D human skeleton estimation from color images and depth images via powerful DCNNs. However, it still remains challenging to apply deep learning frameworks to 3D human skeleton extraction from point clouds because of the sparsity of point clouds and the high nonlinearity of human skeleton regression. In this paper, we develop a deep learning-based approach for 3D human skeleton extraction from point clouds. We convert 3D human skeleton extraction into offset vector regression and human body segmentation via deep learning-based point cloud contraction. Furthermore, a disambiguation strategy is adopted to improve the robustness of the joint regression. Experiments on the public human pose dataset UBC3V and our own human point cloud skeleton dataset 3DHumanSkeleton show that the proposed approach outperforms the state-of-the-art methods.

FAKIR: An algorithm for revealing the anatomy and pose of statues from raw point sets

Tong Fu, Raphaelle Chaine, Julie Digne

3D acquisition of archaeological artefacts has become an essential part of cultural heritage research for preservation or restoration purposes. Statues, in particular, have been at the center of many projects. In this paper, we introduce a way to improve the understanding of acquired statues representing real or imaginary creatures by registering a simple and pliable articulated model to the raw point set data. Our approach performs a Forward And bacKward Iterative Registration (FAKIR), which proceeds joint by joint and needs only a few iterations to converge. We are thus able to detect the pose and elementary anatomy of sculptures with possibly non-realistic body proportions. By adapting our simple skeleton, our method can work on animals and imaginary creatures.

Tracking and Saliency

Learning Target-Adaptive Correlation Filters for Visual Tracking

Ying She, Yang Yi, Jialiang Gu

Correlation filters (CF) achieve excellent performance in visual tracking but suffer from undesired boundary effects. A significant number of approaches focus on enlarging the search region to make up for this shortcoming. However, this introduces excessive background noise and misleads the filter into learning from ambiguous information. In this paper, we propose a novel target-adaptive correlation filter (TACF) that incorporates context and spatial-temporal regularizations into the CF framework, thus learning a more robust appearance model in the case of large appearance variations. Besides, it can be effectively optimized via the alternating direction method of multipliers (ADMM), thus achieving a globally optimal solution. Finally, an adaptive updating strategy is presented to discriminate unreliable samples and alleviate the contamination of these training samples. Extensive evaluations on the OTB-2013, OTB-2015, VOT-2016, VOT-2017 and TC-128 datasets demonstrate that our TACF is very promising for various challenging scenarios compared with several state-of-the-art trackers, with real-time performance of 20 frames per second (fps).

An Occlusion-aware Edge-Based Method for Monocular 3D Object Tracking using Edge Confidence

Huang Hong, Fan Zhong, Xueying Qin

We propose an edge-based method for 6DOF pose tracking of rigid objects using a monocular RGB camera. One of the critical problems for edge-based methods is to search for the object contour points in the image corresponding to the known 3D model points. However, previous methods often produce false object contour points in cases of cluttered backgrounds and partial occlusions. In this paper, we propose a novel edge-based 3D object tracking method to tackle this problem. To search for the object contour points, foreground and background clutter points are first filtered out using edge color cues, then optimal object contour points are searched for by maximizing their edge confidence, which combines edge color and distance cues. Furthermore, the edge confidence is integrated into the edge-based cost function for pose optimization, so as to reduce the influence of false contour points caused by cluttered backgrounds and partial occlusions. We also extend our method to multi-object tracking, which can handle mutual occlusions. We compare our method to the state-of-the-art methods on challenging public datasets. Experiments demonstrate that our method improves robustness and accuracy against cluttered backgrounds and partial occlusions.

Coarse to Fine: Weak Feature Boosting Network for Salient Object Detection

Chenhao Zhang, Shanshan Gao, Xiao Pan, Yuting Wang, Yuanfeng Zhou

Salient object detection is to identify objects or regions with maximum visual recognition in an image, which brings significant help and improvement to many computer vision tasks. Although many methods have been proposed for salient object detection, the problem is still not perfectly solved, especially when the background scene is complex or the salient object is small. In this paper, we propose a novel Weak Feature Boosting Network (WFBNet) for the salient object detection task. In the WFBNet, we extract the unpredictable regions (low-confidence regions) of the image via a polynomial function and enhance the features of these regions through a well-designed weak feature boosting module (WFBM). Starting from a coarse saliency map, we gradually refine it according to the boosted features to obtain the final saliency map, and our network does not need any post-processing step. We conduct extensive experiments on five benchmark datasets using comprehensive evaluation metrics. The results show that our algorithm has considerable advantages over the existing state-of-the-art methods.

Vision Meets Graphics

Generating High-quality Superpixels in Textured Images

Zhe Zhang, Panpan Xu, Jian Chang, Wencheng Wang, Chong Zhao, Jian Jun Zhang

Superpixel segmentation is important for promoting various image processing tasks. However, existing methods still have difficulties in generating high-quality superpixels in textured images, because they cannot separate textures from structures well. Though texture filtering can be adopted for smoothing textures before superpixel segmentation, the filtering would also smooth the object boundaries, and thus weaken the quality of generated superpixels. In this paper, we propose to use adaptive scale box smoothing instead of texture filtering to obtain more high-quality texture and boundary information. Furthermore, we design a novel distance metric to measure the distance between different pixels, which considers boundary, color and Euclidean distance simultaneously. As a result, our method can achieve high-quality superpixel segmentation in textured images without texture filtering. The experimental results demonstrate the superiority of our method over existing methods, including learning-based ones. Benefiting from using boundaries to guide superpixel segmentation, our method can also suppress noise to generate high-quality superpixels in non-textured images.
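One possible form of such a combined distance (the paper's exact weighting differs; the weights and the binary boundary term below are illustrative):

```python
import numpy as np

def pixel_distance(color_a, color_b, pos_a, pos_b, boundary_crossed,
                   w_color=1.0, w_space=0.25, w_boundary=10.0):
    """Distance between a pixel and a superpixel seed combining color,
    Euclidean position, and a penalty for crossing a detected boundary."""
    d_color = np.linalg.norm(color_a - color_b)
    d_space = np.linalg.norm(pos_a - pos_b)
    return np.sqrt((w_color * d_color) ** 2
                   + (w_space * d_space) ** 2
                   + (w_boundary * float(boundary_crossed)) ** 2)

d = pixel_distance(np.array([0.8, 0.4, 0.2]), np.array([0.7, 0.5, 0.2]),
                   np.array([10, 12]), np.array([14, 9]), boundary_crossed=False)
```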

InstanceFusion: Real-time Instance-level 3D Reconstruction Using a Single RGBD Camera

Feixiang Lu, Haotian Peng, Hongyu Wu, Jun Yang, Xinhang Yang, Ruizhi Cao, Liangjun Zhang, Ruigang Yang, Bin Zhou

We present InstanceFusion, a robust real-time system to detect, segment, and reconstruct instance-level 3D objects of indoor scenes with a hand-held RGBD camera. It combines the strengths of deep learning and traditional SLAM techniques to produce visually compelling 3D semantic models. The key success comes from our novel segmentation scheme and the efficient instance-level data fusion, which are both implemented on the GPU. Specifically, for each incoming RGBD frame, we take advantage of the RGBD features, the 3D point cloud, and the reconstructed model to perform instance-level segmentation. The corresponding RGBD data along with the instance ID are then fused into the surfel-based models. In order to sufficiently store and update these data, we design and implement a new data structure using the OpenGL Shading Language. Experimental results show that our method advances over the state-of-the-art (SOTA) methods in instance segmentation and data fusion by a large margin. Besides, our instance segmentation improves the precision of 3D reconstruction, especially at loop closures. The InstanceFusion system runs at 20.5 Hz on a consumer-level GPU, which supports a large number of augmented reality (AR) applications (e.g., 3D model registration, virtual interaction, AR maps) and robot applications (e.g., navigation, manipulation, grasping). To facilitate future research and to make our system easier to reproduce, source code, data, and the trained model are anonymously released on Github (https://github.com/Fancomi2017/InstanceFusion).

Weakly Supervised Part-wise 3D Shape Reconstruction from Single-View RGB Images

Chengjie Niu, Yang Yu, Zhenwei Bian, Jun Li, Kai Xu

In order for the deep learning models to truly understand the 2D images for 3D geometry recovery, we argue that single-view reconstruction should be learned in a part-aware and unsupervised manner. Such models lead to more profound interpretation of the 2D images in which part-based parsing and assembling are involved. To this end, we learn a deep neural network which takes a single-view RGB image as input, and outputs a 3D shape in parts represented by 3D point clouds with an array of 3D part generators. In particular, we devise two levels of generative adversarial network (GAN) to generate shape with both correct part structure and reasonable overall shape. To enable unsupervised network training, we devise a differentiable projection module along with a self-projection loss measuring the error between the shape projection and the input image. Through qualitative and quantitative evaluations on public datasets, we show that our method achieves good performance in part-wise single-view reconstruction.

Deep Separation of Direct and Global Components from a Single Photograph under Structured Lighting

Zhaoliang Duan, James Bieron, Pieter Peers

We present a deep learning based solution for separating the direct and global light transport components from a single photograph captured under high frequency structured lighting with a co-axial projector-camera setup. We employ an architecture with one encoder and two decoders that shares information between the encoder and the decoders, as well as between both decoders to ensure a consistent decomposition between both light transport components. Furthermore, our deep learning separation approach does not require binary structured illumination, allowing us to utilize the full resolution capabilities of the projector. Consequently, our deep separation network is able to achieve high fidelity decompositions for lighting frequency sensitive features such as subsurface scattering and specular reflections. We evaluate and demonstrate our direct and global separation method on a wide variety of synthetic and captured scenes.

Image Restoration

Pixel-wise Dense Detector for Image Inpainting

Ruisong Zhang, Weize Quan, Baoyuan Wu, Zhifeng Li, Dongming Yan

Recent GAN-based image inpainting approaches adopt an average strategy to discriminate the generated image and output a scalar, which inevitably loses the position information of visual artifacts. Moreover, the adversarial loss and reconstruction loss (e.g., l1 loss) are combined with trade-off weights, which are also difficult to tune. In this paper, we propose a novel detection-based generative framework for image inpainting, which adopts the min-max strategy in an adversarial process. The generator follows an encoder-decoder architecture to fill the missing regions, and the detector, using weakly supervised learning, localizes the position of artifacts in a pixel-wise manner. Such position information makes the generator pay attention to the artifact regions and further refine them. More importantly, we explicitly insert the output of the detector into the reconstruction loss with a weighting criterion, which balances the weights of the adversarial loss and reconstruction loss automatically rather than by manual tuning. Experiments on multiple public datasets show the superior performance of the proposed framework.
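The weighting idea can be sketched as follows, with a hypothetical form of the criterion: the detector's per-pixel artifact probability scales the reconstruction term so that flagged pixels cost more:

```python
import torch

def weighted_reconstruction_loss(generated, target, artifact_map, lam=5.0):
    """Per-pixel L1 loss, re-weighted by the detector's artifact probabilities.
    artifact_map is in [0, 1], with 1 meaning 'looks fake' at that pixel."""
    per_pixel = (generated - target).abs().mean(dim=1)     # average over color channels
    weight = 1.0 + lam * artifact_map                       # emphasize flagged pixels
    return (weight * per_pixel).mean()

gen = torch.rand(2, 3, 64, 64)
gt = torch.rand(2, 3, 64, 64)
det = torch.rand(2, 64, 64)          # detector output (pixel-wise fake probability)
loss = weighted_reconstruction_loss(gen, gt, det)
```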

CLA-GAN: A Context and Lightness Aware Generative Adversarial Network for Shadow Removal

Ling Zhang, Chengjiang Long, Qingan Yan, Xiaolong Zhang, Chunxia Xiao

In this paper, we propose a novel context and lightness aware Generative Adversarial Network (CLA-GAN) framework for shadow removal, which refines a coarse result into the final shadow removal result in a coarse-to-fine fashion. At the refinement stage, we first obtain a lightness map using an encoder-decoder structure. With the lightness map and the coarse result as inputs, the following encoder-decoder refines the final result. Specifically, different from current methods that solely explore pixel-based features from shadow images, we embed a context-aware module into the refinement stage, which exploits patch-based features. The embedded module transfers features from non-shadow regions to shadow regions to ensure the consistency in appearance of the recovered shadow-free images. By considering patches, the module can additionally enhance the spatial association and continuity around neighboring pixels. To make the model pay more attention to shadow regions during training, we use dynamic weights in the loss function. Moreover, we augment the inputs of the discriminator by rotating images by different degrees and use a rotation adversarial loss during training, which makes the discriminator more stable and robust. Extensive experiments demonstrate the validity of the components in our CLA-GAN framework. Quantitative evaluation on different shadow datasets clearly shows the advantages of our CLA-GAN over the state-of-the-art methods.

Not All Areas Are Equal: A Novel Separation-Restoration-Fusion Network for Raindrops Removal

Dongdong Ren, Jinbao Li, Meng Han, Minglei Shu

Detecting and removing raindrops from an image while keeping the high quality of image details has attracted tremendous studies, but remains a challenging task due to the inhomogeneity of the degraded regions and the complexity of the degraded intensity. In this paper, we move away from the dependence of deep learning on image-to-image translation and propose a separation-restoration-fusion network for raindrop removal. Our key idea is to recover regions of different damage levels individually, so that each region achieves the optimal recovery result, and finally fuse the recovered areas together. In the region restoration module, to complete the restoration of a specific area, we propose a multi-scale feature fusion global information aggregation attention network to achieve global-to-local information aggregation. In addition, we also design an inside and outside dense connection dilated network to ensure the fusion of the separated regions and the fine restoration of the image. Qualitative and quantitative evaluations are conducted to compare our method with the latest existing methods. The results demonstrate that our method outperforms state-of-the-art methods by a large margin on the benchmark datasets in extensive experiments.

SCGA-Net: Skip Connections Global Attention Network for Image Restoration

Dongdong Ren, Jinbao Li, Meng Han, Minglei Shu

Deep convolutional neural networks (DCNNs) have shown their advantages in image restoration tasks. However, most existing DCNN-based methods still suffer from residual corruptions and coarse textures. In this paper, we propose a general framework, the “Skip Connections Global Attention Network”, which focuses on delivering semantics from shallow layers to deep layers for low-level vision tasks, including image dehazing, image denoising, and low-light image enhancement. First of all, by applying dense dilated convolution and a multi-scale feature fusion mechanism, we establish a novel encoder-decoder network framework to aggregate large-scale spatial context and enhance feature reuse. Secondly, our solution for the skip connections uses an attention mechanism to constrain information, thereby enhancing the high-frequency details of feature maps and suppressing the output of corruptions. Finally, we also present a novel attention module dubbed global constraint attention, which effectively captures the relationship between pixels on the entire feature map, to obtain the subtle differences among pixels and produce an overall optimal 3D attention map. Extensive experiments demonstrate that the proposed method achieves significant improvements over the state-of-the-art methods in image dehazing, image denoising, and low-light image enhancement.

Image Manipulation

Diversifying Semantic Image Synthesis and Editing via Class- and Layer-wise VAEs

Yuki Endo, Yoshihiro Kanamori

Semantic image synthesis is a process for generating photorealistic images from a single semantic mask. To enrich the diversity of multimodal image synthesis, previous methods have controlled the global appearance of an output image by learning a single latent space. However, a single latent code is often insufficient for capturing various object styles because object appearance depends on multiple factors. To handle individual factors that determine object styles, we propose a class- and layer-wise extension to the variational autoencoder (VAE) framework that allows flexible control over each object class at the local to global levels by learning multiple latent spaces. Furthermore, we demonstrate that our method generates images that are both plausible and more diverse compared to state-of-the-art methods via extensive experiments with real and synthetic datasets on three different domains. We also show that our method enables a wide range of applications in image synthesis and editing tasks.

Simultaneous Multi-Attribute Image-to-Image Translation Using Parallel Latent Transform Networks

Senzhe Xu, Yu-Kun Lai

Image-to-image translation has been widely studied. Since real-world images can often be described by multiple attributes, it is useful to manipulate them at the same time. However, most methods focus on transforming between two domains, and when they chain multiple single-attribute transform networks together, the results are affected by the order of chaining, and the performance drops due to the out-of-domain issue for intermediate results. Existing multi-domain transfer methods mostly manipulate multiple attributes by appending a list of attribute labels to the network features, but they also suffer from interference between different attributes, and their performance degrades when multiple attributes are manipulated. We propose a novel approach to multi-attribute image-to-image translation using several parallel latent transform networks, where multiple attributes are manipulated in parallel and simultaneously, which eliminates both issues. To avoid the interference of different attributes, we introduce a novel soft independence constraint on the changes caused by different attributes. Extensive experiments show that our method outperforms state-of-the-art methods.

Interactive Design and Preview of Colored Snapshots of Indoor Scenes

Qiang Fu, Hai Yan, Hongbo Fu, Xueming Li

This paper presents an interactive system for quickly designing and previewing colored snapshots of indoor scenes. Different from high-quality 3D indoor scene rendering, which often requires high-performance computing devices and several minutes to render a moderately complicated scene under a specific color theme, our system aims at improving the effectiveness of color theme design for indoor scenes and employs an image colorization approach to efficiently obtain high-resolution snapshots with editable colors. Given several pre-rendered, multi-layer, gray images of the same indoor scene snapshot, our system colorizes and merges them into a single colored snapshot. Our system also assists users in assigning colors to certain objects/components and infers more harmonious colors for the unassigned objects based on pre-collected priors to guide the colorization. The quickly generated snapshots of indoor scenes provide previews of interior design schemes with different color themes, making it easy to determine a personalized design of indoor scenes. To demonstrate the usability and effectiveness of this system, we present a series of experimental results on indoor scenes of different types, and compare our method with a state-of-the-art method for indoor scene material and color suggestion and with offline/online rendering software packages.

A Multi-Person Selfie System via Augmented Reality

Jie Lin, Chuan-Kai Yang

In recent years, selfies have become so popular that selfie sticks have become popular as well, since they can be used to capture images of oneself or even a group of people when needed. However, no matter how experienced a user is, the distortion in the captured photo caused by the limited length of a selfie stick remains a serious problem. We propose a technique, based on modifying existing augmented reality technology, to support selfies of multiple people by properly aligning different photographing processes. We show that our technique not only avoids the common distortion of using a selfie stick, but also facilitates the composition of a group photo. In addition, the proposed technique can be used to create special effects, such as the illusion of a person appearing multiple times in the same photo.

Multi-scale Information Assembly for Image Matting

Yu Qiao, Yuhao Liu, Qiang Zhu, Xin Yang, Yuxin Wang, Qiang Zhang, Xiaopeng Wei

Image matting is a long-standing problem in computer graphics and vision, mostly identified as the accurate estimation of the foreground in input images. We argue that foreground objects can be represented by different levels of information, including the central bodies, large-grained boundaries, refined details, etc. Based on this observation, we propose a multi-scale information assembly framework (MSIA-matte) to pull out high-quality alpha mattes from single RGB images. Technically speaking, given an input image, we extract high-level semantics as the subject content and retain initial CNN features to encode different levels of foreground expression, and then combine them with our well-designed information assembly strategy. Extensive experiments demonstrate the effectiveness of the proposed MSIA-matte, which achieves state-of-the-art performance compared to most existing matting networks.

Stylized Graphics

StyleProp: Real-time Example-based Stylization of 3D Models

Filip Hauptfleisch, Ondřej Texler, Aneta Texler, Jaroslav Křivánek, Daniel Sýkora

We present a novel approach to real-time non-photorealistic rendering of 3D models whose appearance is specified by a single hand-drawn exemplar. We employ guided patch-based synthesis to achieve high visual quality as well as temporal coherence. However, unlike previous techniques that maintain consistency in only one dimension (the temporal domain), our approach takes multiple dimensions into account to cover all degrees of freedom given by the available space of interactions (e.g., camera rotations). To enable a real-time experience, we pre-calculate a sparse latent representation of the entire interaction space, from which the stylized image can be generated in real time even on a mobile device. To the best of our knowledge, the proposed system is the first that enables interactive example-based stylization of 3D models with full temporal coherence in a predefined interaction space.

Two-stage Photograph Cartoonization via Line Tracing

Simin Li, Qiang Wen, Shuang Zhao, Zixun Sun, Guoqiang Han, Shengfeng He

Cartoons are highly abstracted with clear edges, which makes them unique among art forms. In this paper, we focus on the essential cartoon factors of abstraction and edges, aiming to cartoonize real-world photographs like an artist. To this end, we propose a two-stage network whose stages explicitly target abstracted shading and crisp edges, respectively. In the first, abstraction stage, we propose a novel unsupervised bilateral flattening loss, which allows generating high-quality smoothing results in a label-free manner. Together with two other semantic-aware losses, the abstraction stage imposes different forms of regularization for creating cartoon-like flattened images. In the second stage, we draw lines on the structural edges of the flattened cartoon using a fully supervised line drawing objective and an unsupervised edge augmenting loss. We collect a cartoon-line dataset with line tracing, which serves as the starting point for preparing abstraction and line drawing data. We have evaluated the proposed method on a large number of photographs, converting them to three different cartoon styles. Our method substantially outperforms state-of-the-art methods in terms of visual quality, both quantitatively and qualitatively.
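
For intuition, a bilateral flattening term could take the form of the hypothetical sketch below (assuming PyTorch), where output differences between neighbouring pixels are penalized more strongly wherever the input photograph is locally similar, flattening shading while keeping strong edges; the paper's actual loss may be defined differently.

    # Hypothetical bilateral-weighted flattening penalty (illustration only).
    import torch

    def bilateral_flattening_loss(output, guide, sigma=0.1):
        """output, guide: (B, C, H, W); guide is the input photograph."""
        loss = 0.0
        for dy, dx in [(0, 1), (1, 0)]:   # horizontal and vertical neighbours
            o_diff = output[:, :, dy:, dx:] - output[:, :, :output.shape[2] - dy, :output.shape[3] - dx]
            g_diff = guide[:, :, dy:, dx:] - guide[:, :, :guide.shape[2] - dy, :guide.shape[3] - dx]
            # Affinity is high where the photograph is locally similar,
            # so the output is pushed to be flat exactly there.
            affinity = torch.exp(-(g_diff ** 2).sum(1, keepdim=True) / (2 * sigma ** 2))
            loss = loss + (affinity * o_diff.abs()).mean()
        return loss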

Colorization of Line Drawing with Empty Pupils

Kenta Akita, Yuki Morimoto, Reiji Tsuruno

Many studies have recently applied deep learning to the automatic colorization of line drawings. However, it is difficult to paint empty pupils using existing methods because the networks are trained with pupils that have edges, which are generated from color images using image processing. Most actual line drawings have empty pupils that artists must paint in. In this paper, we propose a novel network model that transfers the pupil details in a reference color image to input line drawings with empty pupils. We also propose a method for accurately and automatically coloring eyes. In this method, eye patches are extracted from a reference color image and automatically added to an input line drawing as color hints using our eye position estimation network.

Visualization and Interaction

RadEx: Integrated Visual Exploration of Multiparametric Studies for Radiomic Tumor Profiling

Eric Mörth, Kari Wagner-Larsen, Erlend Hodneland, Camilla Krakstad, Ingfrid Haldorsen, Stefan Bruckner, Noeska Smit

A better understanding of the complex processes driving tumor growth and metastases is critical for developing targeted treatment strategies in cancer. Radiomics extracts large numbers of features from medical images, which enables radiomic tumor profiling in combination with clinical markers. However, analyzing complex imaging data in combination with clinical data is not trivial, and supporting tools aiding in these exploratory analyses are presently missing. In this paper, we present an approach that aims to enable the analysis of multiparametric medical imaging data in combination with numerical, ordinal, and categorical clinical parameters in order to validate established and unravel novel biomarkers. We propose a hybrid approach where dimensionality reduction to a single axis is combined with multiple linked views, allowing clinical experts to formulate hypotheses based on all available imaging data and clinical parameters. This may help to reveal novel tumor characteristics in relation to molecular targets for treatment, thus providing better tools for enabling more personalized targeted treatment strategies. To confirm the utility of our approach, we closely collaborated with experts in gynecological cancer imaging and conducted an evaluation with six experts in this field.

Slice and Dice: A Physicalization Workflow for Anatomical Edutainment

Renata Georgia Raidou, Eduard Gröller, Hsiang-Yun Wu

During the last decades, anatomy has become an interesting topic in education, even for laymen and schoolchildren. As medical imaging techniques become increasingly sophisticated, virtual anatomical education applications have emerged. Still, physical anatomical models are often preferred, as they facilitate 3D localization of anatomical structures. Recently, data physicalizations, i.e., physical visualizations, have proven to be effective and engaging, sometimes even more so than their virtual counterparts. So far, medical data physicalizations involve mainly 3D printing, which is still expensive and cumbersome. We investigate alternative forms of physicalizations, which use readily available technologies (home printers) and inexpensive materials (paper or semi-transparent films) to generate crafts for anatomical edutainment. To the best of our knowledge, this is the first computer-generated crafting approach within an anatomical edutainment context. Our approach follows a cost-effective, simple, and easy-to-employ workflow, resulting in assemblable data sculptures, i.e., semi-transparent sliceforms. It primarily supports volumetric data (such as CT or MRI), but mesh data can also be imported. An octree slices up the imported volume, and an optimization step simplifies the slice configuration and proposes the optimal order for easy assembly. A packing algorithm places the resulting slices with their labels, annotations, and assembly instructions on a paper or transparent film of user-selected size, to be printed, assembled into a sliceform, and explored. We conducted a study with 10 participants, demonstrating that our approach is an initial step towards the successful creation of interactive and engaging anatomical physicalizations.


Visual Analytics in Dental Aesthetics

Aleksandr Amirkhanov, Matthias Bernhard, Alexey Karimov, Sabine Stiller, Andreas Geier, Eduard Gröller, Gabriel Mistelbauer

Dental healthcare increasingly employs computer-aided design software to provide patients with high-quality dental prosthetic devices. In modern dental reconstruction, the dental technician captures the dental impression and measures the mandibular movements of the patient, as every patient has a unique anatomy. Subsequently, the dental technician designs a custom denture that fits the patient from a functional point of view. The current workflow does not include a systematic aesthetics analysis, and dental technicians rely only on an aesthetically pleasing mock-up that they discuss with the patient and on their experience. Therefore, the final denture aesthetics remain unknown until the late stage at which the dental technician fits the denture to the patient. In this work, we present a solution that integrates aesthetics analysis into the functional workflow of dental technicians. Our solution uses a video recording of the patient to preview the denture design at any stage of the denture design process. We present a teeth pose estimation technique to enable the denture preview, and a set of visualizations that support dental technicians in the aesthetic design. In particular, we employ an abstracted facial and dental proportions view to assist dental technicians in choosing the most aesthetically fitting preset from a library of dentures. We include a dental proportions view, a facial proportions view, and a harmony view to help dental technicians identify a suitable denture size. A dental histogram enables adjusting the denture position, and an aesthetometer and a smile-lines view provide visual feedback for fine-grained adjustments. We demonstrate the utility of our system with four use cases, conducted by a dental technician. We performed a quantitative evaluation of the teeth pose estimation and an informal usability evaluation with a dental technician, with positive outcomes for the integration of aesthetics analysis into the functional workflow.

Short Papers

Rendering

An Energy-Conserving Hair Shading Model Based on Neural Style Transfer

Zhi Qiao and Takashi Kanai

We present a novel approach for shading photorealistic hair animation, an essential visual element for depicting realistic hair of virtual characters. Our model is able to shade high-quality hair quickly by extending conditional Generative Adversarial Networks. Furthermore, our method is much faster than previous, onerous rendering algorithms and produces fewer artifacts than other neural image translation methods. In this work, we provide a novel energy-conserving hair shading model, which retains the vast majority of the semi-transparent appearance and accurately reproduces the interaction with the lights of the scene. Our method is easy to implement and is faster and computationally more efficient than previous algorithms.

Illumination Space: A Feature Space for Radiance Maps
Andrew Chalmers, Todd Zickler, and Taehyun Rhee

Radiance maps (RM) are used for capturing the lighting properties of real-world environments. Databases of RMs are useful for various rendering applications such as Look Development, live action composition, mixed reality, and machine learning. Such databases are not useful if they cannot be organized in a meaningful way. To address this, we introduce the illumination space, a feature space that arranges RM databases based on illumination properties. We avoid manual labeling by automatically extracting features from an RM that provides a concise and semantically meaningful representation of its typical lighting effects. This is made possible with the following contributions: a method to automatically extract a small set of dominant and ambient lighting properties from RMs, and a low-dimensional (5D) light feature vector summarizing these properties to form the illumination space. Our method is motivated by how the RM illuminates the scene as opposed to describing the textural content of the RM.
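
A minimal sketch of how a dominant light direction might be extracted from an equirectangular radiance map is given below (NumPy; the solid-angle weighting and the simple mean-luminance ambient term are our own assumptions, not the paper's exact definitions).

    # Minimal sketch: dominant light direction as the luminance-weighted mean
    # of per-pixel directions, with solid-angle weighting to undo the
    # equirectangular distortion.
    import numpy as np

    def dominant_light_direction(rm):
        """rm: (H, W, 3) linear RGB equirectangular radiance map."""
        h, w, _ = rm.shape
        lum = rm @ np.array([0.2126, 0.7152, 0.0722])     # (H, W) luminance
        theta = (np.arange(h) + 0.5) / h * np.pi           # polar angle per row
        phi = (np.arange(w) + 0.5) / w * 2.0 * np.pi       # azimuth per column
        sin_t = np.sin(theta)[:, None]                     # solid-angle weight per row
        # Per-pixel unit directions on the sphere.
        dx = sin_t * np.cos(phi)[None, :]
        dy = np.cos(theta)[:, None] * np.ones((1, w))
        dz = sin_t * np.sin(phi)[None, :]
        weight = lum * sin_t                               # luminance x solid angle
        d = np.stack([(weight * dx).sum(), (weight * dy).sum(), (weight * dz).sum()])
        d /= np.linalg.norm(d) + 1e-12
        ambient = float(weight.sum() / (sin_t.sum() * w))  # weighted mean luminance
        return d, ambient

    rm = np.random.rand(64, 128, 3)                        # placeholder radiance map
    direction, ambient = dominant_light_direction(rm)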

Creation and Reconstruction

A Deep Learning Based Interactive Sketching System for Fashion Images Design
Yao Li, Xianggang Yu, Xiaoguang Han, Nianjuan Jiang, Kui Jia, and Jiangbo Lu

In this work, we propose an interactive system to design diverse high-quality garment images from fashion sketches and texture information. The major challenge behind this system is to generate high-quality and detailed textures according to the user-provided texture information. Prior works mainly use a texture patch representation and try to map a small texture patch to a whole garment image, and hence are unable to generate high-quality details. In contrast, inspired by intrinsic image decomposition, we decompose this task into texture synthesis and shading enhancement. In particular, we propose a novel bi-colored edge texture representation to synthesize textured garment images and a shading enhancer to render shading based on the grayscale edges. The bi-colored edge representation provides simple but effective texture cues and color constraints, so that the details can be better reconstructed. Moreover, with the rendered shading, the synthesized garment image becomes more vivid.

Monocular 3D Fluid Volume Reconstruction Based on a Multilayer External Force Guiding Model
Zhiyuan Su, Xiaoying Nie, Xukun Shen, and Yong Hu

In this paper, we present a monocular 3D fluid volume reconstruction technique that can alleviate challenging parameter tuning while vividly reproducing the inflow and outflow of the video scene. To reconstruct the geometric appearance and 3D motion of the fluid in the video, we propose a multilayer external force guiding model that formulates the effect of target particles on fluid particles. This multilayer model makes the whole 3D fluid volume subject to the shape and motion of the water captured by the input video, so we can avoid tedious and laborious parameter tuning and easily balance the smoothness of the fluid volume and the details of the water surface. In addition, for the inflow and outflow of the 3D fluid volume, we construct a generation and extinction model to add or delete fluid particles according to the 3D velocity field of target particles, calculated by a hybrid model coupling SfS with optical flow. Experiments show that our method compares favorably to the state of the art in terms of reconstruction quality and generalizes better to real captured fluids. Furthermore, the reconstructed 3D fluid volume can be effectively applied to any desired new scenario.

Geometric Computations

A Robust Feature-aware Sparse Mesh Representation
Lizeth Joseline Fuentes Perez, Luciano Arnaldo Romero Calla, Anselmo Antunes Montenegro, Claudio Mura, and Renato Pajarola

The sparse representation of signals defined on Euclidean domains has been successfully applied in signal processing. Bringing the power of sparse representations to non-regular domains is still a challenge, but promising approaches have started emerging recently. In this paper, we investigate the problem of sparsely representing discrete surfaces and propose a new representation that is capable of providing tools for solving different geometry processing problems. The sparse discrete surface representation is obtained by combining innovative approaches into an integrated method. First, to deal with irregular mesh domains, we devise a new way to subdivide discrete meshes into a set of patches using feature-aware seed sampling. Second, we achieve good surface approximation with over-fitting control by combining the power of a continuous global dictionary representation with a modified Orthogonal Matching Pursuit. The resulting discrete surface approximations preserve shape features while being robust to over-fitting. Our results show that the method is quite promising for applications such as surface re-sampling and mesh compression.
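
For reference, a minimal NumPy sketch of standard Orthogonal Matching Pursuit is given below; in the paper, the per-patch surface signals would play the role of y, and the continuous global dictionary D is assumed to be given.

    # Minimal sketch of Orthogonal Matching Pursuit: greedily pick the atom
    # most correlated with the residual, then re-fit all chosen coefficients
    # jointly by least squares.
    import numpy as np

    def omp(D, y, sparsity):
        """D: (n, k) dictionary with unit-norm columns; y: (n,) signal."""
        residual = y.copy()
        support = []
        coeffs = np.zeros(D.shape[1])
        for _ in range(sparsity):
            # Atom most correlated with the current residual.
            idx = int(np.argmax(np.abs(D.T @ residual)))
            if idx not in support:
                support.append(idx)
            # Re-fit all selected coefficients jointly (the "orthogonal" step).
            sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
            residual = y - D[:, support] @ sol
        coeffs[support] = sol
        return coeffs

    D = np.linalg.qr(np.random.randn(100, 100))[0][:, :40]   # toy unit-norm dictionary
    y = D[:, [3, 17]] @ np.array([1.5, -0.7])                # a 2-sparse signal
    x = omp(D, y, sparsity=2)                                 # recovers the two atoms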

Simple Simulation of Curved Folds Based on Ruling-aware Triangulation
Kosuke Sasaki and Jun Mitani

Folding a thin sheet material such as paper along curves creates a developable surface composed of ruled surface patches. When using such surfaces in design, designers often repeat a process of folding along curves drawn on a sheet and checking the folded shape. Although several methods for constructing such shapes on a computer have been proposed, it is still difficult to check the folded shapes instantly from the crease patterns. In this paper, we propose a simple method that approximately realizes a simulation of curved folds with a triangular mesh from its crease pattern. The proposed method first approximates curves in a crease pattern with polylines and then generates a triangular mesh. In order to construct the discretized developable surface, the edges in the mesh are rearranged so that they align with the estimated rulings. The proposed method is characterized by its simplicity and is implemented on an existing origami simulator that runs in a web browser.

Using Landmarks for Near-Optimal Pathfinding on the CPU and GPU
Maximilian Reischl, Christian Knauer, and Michael Guthe

We present a new approach for pathfinding in weighted graphs using pre-computed minimal distance fields. By selecting the most promising minimal distance field at any given node and switching between fields, our algorithm tries to find the shortest path. As we show, this approach scales very well across different topologies, hardware, and graph sizes, and has a mean length error below 1% while using reasonable amounts of memory. By keeping a simple structure and minimal backtracking, we are able to use the same approach on the massively parallel GPU, reducing the run time even further.
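
The landmark idea is closely related to the classical ALT lower bound: with exact distances precomputed from a few landmark nodes, the triangle inequality yields an admissible A* heuristic. The sketch below (Python) shows that well-known variant for context; the paper's algorithm instead switches greedily between distance fields rather than running A*.

    # Landmark-based lower bounds (the classical ALT idea), for context only.
    import heapq

    def alt_heuristic(v, target, landmark_dists):
        """landmark_dists: list of dicts mapping node -> exact distance to one landmark."""
        # |d_L(target) - d_L(v)| <= d(v, target) by the triangle inequality.
        return max(abs(d[target] - d[v]) for d in landmark_dists)

    def astar(graph, source, target, landmark_dists):
        """graph: dict node -> list of (neighbor, weight)."""
        pq = [(alt_heuristic(source, target, landmark_dists), 0.0, source)]
        best = {source: 0.0}
        while pq:
            _, g, v = heapq.heappop(pq)
            if v == target:
                return g
            if g > best.get(v, float("inf")):
                continue  # stale entry
            for u, w in graph[v]:
                if g + w < best.get(u, float("inf")):
                    best[u] = g + w
                    f = g + w + alt_heuristic(u, target, landmark_dists)
                    heapq.heappush(pq, (f, g + w, u))
        return float("inf")

    graph = {0: [(1, 2.0)], 1: [(0, 2.0), (2, 1.0)], 2: [(1, 1.0)]}
    landmarks = [{0: 0.0, 1: 2.0, 2: 3.0}]   # exact distances from node 0 as the landmark
    assert astar(graph, 0, 2, landmarks) == 3.0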

Posters and Work-In-Progress Papers

Posters

Interactive Video Completion with SiamMask
Satsuki Tsubota and Makoto Okabe

Reconstructing Monte Carlo Errors as a Blue-noise in Screen Space
Hongli Liu and Honglei Han

Day-to-Night Road Scene Image Translation Using Semantic Segmentation
Seung Youp Baek and Sungkil Lee

Work-in-Progress Papers

RTSDF: Generating Signed Distance Fields in Real Time for Soft Shadow Rendering
Yu Wei Tan, Nicholas Chua, Clarence Koh, and Anand Bhojan

Stroke Synthesis for Inbetweening of Rough Line Animations
Jiazhou Chen, Xinding Zhu, Pierre Bénard, and Pascal Barla