Publication


Featured research published by Miwa Katayama.


Three-Dimensional TV, Video, and Display (conference) | 2004

Algorithm for dynamic 3D object generation from multi-viewpoint images

Kimihiro Tomiyama; Yutaka Orihara; Miwa Katayama; Yuichi Iwadate

In this paper, we present a method for generating high-resolution dynamic 3D objects from multi-viewpoint images. A dynamic 3D object can display fine images of a moving human body from arbitrary viewpoints and consists of the subject's 3D model generated for each video frame. To create a high-resolution dynamic 3D object, we propose a 3D-model-generation method from multi-viewpoint images that uses stereo matching to refine an approximate 3D model obtained by the volume intersection method. Furthermore, to reproduce high-resolution textures, we developed a new technique that determines the visibility of the vertices and polygons of the 3D models. A modeling experiment performed with 19 FireWire cameras confirmed that the proposed method effectively generates high-resolution dynamic 3D objects.
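The volume intersection (visual hull) step the abstract builds on can be sketched as a voxel-carving loop: a voxel survives only if its projection falls inside every camera's silhouette. The orthographic three-camera setup and all dimensions below are invented for illustration, not the paper's calibrated rig.

```python
import numpy as np

def visual_hull(silhouettes, project, res=16):
    """Carve a voxel grid: keep a voxel only if its projection lies
    inside every camera's silhouette (the volume intersection method)."""
    hull = np.ones((res, res, res), dtype=bool)
    for cam, sil in enumerate(silhouettes):
        for x in range(res):
            for y in range(res):
                for z in range(res):
                    u, v = project((x, y, z), cam)
                    inside = 0 <= u < sil.shape[0] and 0 <= v < sil.shape[1] and sil[u, v]
                    if not inside:
                        hull[x, y, z] = False
    return hull

# Toy setup: three orthographic cameras along the axes; each view's
# silhouette is a centred square, so the carved hull is a cube.
res = 16
square = np.zeros((res, res), dtype=bool)
square[4:12, 4:12] = True
silhouettes = [square, square, square]

def project(voxel, cam):
    x, y, z = voxel
    return [(y, z), (x, z), (x, y)][cam]  # drop the axis each camera looks along

hull = visual_hull(silhouettes, project, res)
print(int(hull.sum()))  # 8*8*8 = 512 voxels survive carving
```

The stereo-matching refinement the paper adds would then search for the true surface only near this approximate hull.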


International Journal of Computer Vision | 2011

3D Archive System for Traditional Performing Arts

Kensuke Hisatomi; Miwa Katayama; Kimihiro Tomiyama; Yuichi Iwadate

We developed a 3D archive system for Japanese traditional performing arts. The system generates sequences of 3D actor models of the performances from multi-view video by using a graph-cuts algorithm and stores them together with CG background models and related information. The system can show a scene from any viewpoint as follows: the 3D actor model is integrated with the background model, and the integrated model is projected to a viewpoint that the user indicates with a viewpoint controller.

A challenge in generating the actor models is how to reconstruct thin or slender parts. Japanese traditional costumes include slender parts such as long sleeves, fans, and strings that may be manipulated during the performance. The graph-cuts algorithm is a powerful 3D reconstruction tool, but it tends to cut off those parts because it uses an energy-minimization process. Hence, finding a way to reconstruct such parts is important if we are to preserve these arts for future generations. We therefore devised an adaptive erosion method that works on the visual hull and applied it to the graph-cuts algorithm to extract interior nodes in the thin parts and to prevent the thin parts from being cut off. Another tendency of reconstruction with the graph-cuts algorithm is over-shrinkage of the reconstructed models, which arises because the energy can also be reduced by cutting inside the true surface. To avoid this, we applied a silhouette-rim constraint defined by the number of silhouette rims passing through each node.

By applying the adaptive erosion process and the silhouette-rim constraint, we succeeded in reconstructing a virtual performance with costumes that include thin parts. This paper presents the results of the 3D reconstruction using the proposed method and some outputs of the 3D archive system.
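The cut-off tendency the abstract describes can be made concrete with a toy graph-cut-style energy (per-node data cost plus a smoothness penalty for neighbouring nodes with different labels). All costs here are invented; the point is only that for a thin part, the smoothness term can outweigh the data term, so the minimum-energy labeling drops it.

```python
from itertools import product

def energy(labels, data_cost, smooth_weight):
    """Energy of a binary labeling on a 1-D chain: per-node data cost
    plus a penalty for every pair of neighbours with different labels."""
    e = sum(data_cost[i][lab] for i, lab in enumerate(labels))
    e += smooth_weight * sum(labels[i] != labels[i + 1] for i in range(len(labels) - 1))
    return e

# Toy chain of 5 nodes; node 2 stands for a thin part whose data term
# prefers label 1 ("interior"). Costs are (cost of label 0, cost of label 1).
data_cost = [(0, 5), (0, 5), (3, 0), (0, 5), (0, 5)]
best = min(product((0, 1), repeat=5), key=lambda L: energy(L, data_cost, 2))
print(best)  # the thin node ends up labeled 0: the cut runs through it
```

The paper's adaptive erosion counters exactly this, by forcing interior seed nodes inside the thin parts so the cut cannot remove them.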


Conference on Visual Media Production | 2011

Depth Estimation from Three Cameras Using Belief Propagation: 3D Modelling of Sumo Wrestling

Kensuke Ikeya; Kensuke Hisatomi; Miwa Katayama; Yuichi Iwadate

We propose a method to estimate depth from three wide-baseline camera images using belief propagation. With this method, message propagation is restricted to reduce the effects of boundary overreach, and the maximum, minimum, and kurtosis of the message energy distribution are used to reduce errors caused by large occlusions and textureless areas. In experiments, we focused on scenes from the traditional Japanese sport of sumo and created 3D models from three HD images using our method. We displayed them on a 3D display based on the principle of integral photography (IP). The experimental results confirmed that our method is effective for estimating depth.
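As a minimal sketch of the belief-propagation machinery the abstract relies on, here is min-sum BP on a 1-D chain of pixels rather than the paper's three-camera 2-D formulation; the disparity costs and smoothness penalty are invented.

```python
import numpy as np

def bp_depth_1d(data_cost, smooth, iters=5):
    """Min-sum belief propagation on a 1-D chain of pixels.
    data_cost[p, d]: matching cost of disparity d at pixel p;
    smooth(a, b): pairwise penalty between neighbouring disparities.
    Returns the per-pixel disparity minimising the belief."""
    n, levels = data_cost.shape
    V = np.array([[smooth(a, b) for b in range(levels)] for a in range(levels)])
    msg_r = np.zeros((n, levels))  # message into pixel p from its left neighbour
    msg_l = np.zeros((n, levels))  # message into pixel p from its right neighbour
    for _ in range(iters):
        for p in range(1, n):                    # left-to-right sweep
            h = data_cost[p - 1] + msg_r[p - 1]  # excludes the message coming back from p
            msg_r[p] = (h[:, None] + V).min(axis=0)
        for p in range(n - 2, -1, -1):           # right-to-left sweep
            h = data_cost[p + 1] + msg_l[p + 1]
            msg_l[p] = (h[:, None] + V).min(axis=0)
    belief = data_cost + msg_r + msg_l
    return belief.argmin(axis=1)

# Invented costs: each pixel observes a disparity; cost = |d - observed|.
obs = np.array([0, 0, 0, 2, 2, 2])
data_cost = np.abs(np.arange(3)[None, :] - obs[:, None]).astype(float)
depth = bp_depth_1d(data_cost, lambda a, b: 0.5 * min(abs(a - b), 2))
print(depth.tolist())  # [0, 0, 0, 2, 2, 2]
```

On a chain (a tree), min-sum BP is exact; the paper's contribution lies in how propagation is restricted and how the message-energy statistics are used, which this sketch does not attempt.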


International Conference on Computer Vision | 2009

Method of 3D reconstruction using graph cuts, and its application to preserving intangible cultural heritage

Kensuke Hisatomi; Kimihiro Tomiyama; Miwa Katayama; Yuichi Iwadate

We are developing an archive system that can preserve Japanese traditional dramatic arts, such as “Noh”, in the form of dynamic 3D models. Dynamic 3D models are generated, frame by frame, from video images captured by multiple cameras surrounding a target object. The archive system can present an entire Noh scene from any viewpoint by synthesizing the dynamic 3D models with a computer graphics model of a Noh stage.


Electronic Imaging | 2008

A method for converting three-dimensional models into auto-stereoscopic images based on integral photography

Miwa Katayama; Yuichi Iwadate

We have been researching three-dimensional (3D) reconstruction from images captured by multiple cameras. Currently, we are investigating how to convert 3D models into stereoscopic images. We are interested in integral photography (IP), one of many stereoscopic display systems, because the IP display system reconstructs complete 3D auto-stereoscopic images in theory. This system consists of a high-resolution liquid-crystal panel and a lens array. It enables users to obtain a perspective view of 3D auto-stereoscopic images from any direction. We developed a method for converting 3D models into IP images using the OpenGL API. This method can be applied to normal CG objects because the 3D model is described in a CG format. In this paper, we outline our 3D modeling method and the performance of an IP display system. Then we discuss the method for converting 3D models into IP images and report experimental results.
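The core of an IP display, as the abstract describes it, is that each pixel behind a lens reproduces one ray direction. A minimal pixel-to-ray mapping under a pinhole-lens approximation can be sketched as follows; the pitch, gap, and sub-pixel count are illustrative values, not the panel specifications used in the paper.

```python
def ip_pixel_to_ray(i, j, u, v, pitch=1.0, gap=3.0, n_sub=3):
    """Map a display pixel behind lens (i, j) of the lens array to the
    light ray it reproduces, approximating each lens as a pinhole:
    the ray runs from the pixel through the lens centre.
    (u, v) indexes the n_sub x n_sub pixels behind each lens."""
    lens_x, lens_y = i * pitch, j * pitch          # lens centre on the array plane
    sub = pitch / n_sub                            # sub-pixel spacing on the LCD
    px = lens_x + (u - (n_sub - 1) / 2) * sub      # pixel position behind the lens
    py = lens_y + (v - (n_sub - 1) / 2) * sub
    direction = (lens_x - px, lens_y - py, gap)    # pixel -> lens centre -> outward
    return (lens_x, lens_y, 0.0), direction

origin, direction = ip_pixel_to_ray(2, 1, u=1, v=1)
print(direction)  # central sub-pixel: the ray goes straight ahead, (0.0, 0.0, 3.0)
```

Rendering an IP image then amounts to colouring each display pixel with the 3D model's appearance along its ray, which is what the paper's OpenGL-based conversion performs per elemental lens.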


IEEE Transactions on Circuits and Systems for Video Technology | 2017

Depth Estimation Using an Infrared Dot Projector and an Infrared Color Stereo Camera

Kensuke Hisatomi; Masanori Kano; Kensuke Ikeya; Miwa Katayama; Tomoyuki Mishina; Yuichi Iwadate; Kiyoharu Aizawa

This paper proposes a method of estimating depth from two kinds of stereo images: color stereo images and infrared stereo images. An infrared dot pattern is projected onto the scene by a projector so that the infrared cameras can capture the scene textured by the dots and the depth can be estimated even where the surface is not textured. Cost volumes are calculated for the infrared and color stereo images for each frame and are extended in the time direction to define a spatiotemporal cost volume (st-cost volume). We also extend the cost volume filter in the time direction by modifying the cross-based local multipoint filter (CLMF) and applying it to the st-cost volumes in order to restrain flicker in the time-varying depth maps. To obtain a reliable cost volume, the infrared and color st-cost volumes are integrated into a single cost volume by selecting the cost of either the infrared or the color st-cost volume according to the size of the adaptive kernel used for the CLMF. Then, a graph cut is executed on the cost volume in order to estimate the disparity robustly, even when the baselines of the stereo cameras are set wide enough to ensure spatially high resolution in the depth direction and the shapes of blocks are deformed by the affine transformation. A 2D graph cut is executed on each scan line to reduce processing time and memory consumption. We tested the proposed method on infrared color stereo data sets of real-world scenes and evaluated its effectiveness by comparing it with other recent stereo matching methods and with depth cameras.
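The per-pixel integration of the two st-cost volumes can be sketched as a selection by kernel size, on the idea that a larger adaptive support region signals a more reliable cost. This is a simplified stand-in for the paper's selection rule, with tiny invented arrays.

```python
import numpy as np

def fuse_cost_volumes(cost_ir, cost_color, kernel_ir, kernel_color):
    """Fuse infrared and color cost volumes (H x W x disparities): at
    each pixel, take the cost of the modality whose adaptive filter
    kernel was larger there."""
    use_ir = kernel_ir >= kernel_color              # H x W boolean choice map
    return np.where(use_ir[..., None], cost_ir, cost_color)

# Invented example: 2 x 2 pixels, 3 disparity levels.
cost_ir = np.full((2, 2, 3), 1.0)
cost_color = np.full((2, 2, 3), 9.0)
kernel_ir = np.array([[5, 1], [1, 5]])              # per-pixel kernel sizes
kernel_color = np.array([[2, 2], [2, 2]])
fused = fuse_cost_volumes(cost_ir, cost_color, kernel_ir, kernel_color)
print(fused[0, 0, 0], fused[0, 1, 0])  # 1.0 from infrared, 9.0 from color
```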


Conference on Visual Media Production | 2013

Depth estimation by cost volume with spatial-temporal cross-based local multipoint filter using projecting infrared patterns

Kensuke Hisatomi; Kensuke Ikeya; Miwa Katayama; Yuichi Iwadate; Kiyoharu Aizawa

This paper proposes a robust depth-estimation method that projects infrared dot patterns in order to estimate depth maps of a low-texture dynamic scene. Two infrared cameras observe the projected infrared patterns, and the depth maps are estimated by stereo matching of the patterns. The stereo matching makes use of a cost volume with a cross-based local multipoint filter (CLMF), an edge-preserving smoothing filter that uses adaptive kernels. The adaptive kernel is a window determined adaptively by selecting pixels of similar color. In this paper, the CLMF is extended beyond the spatial dimension to the temporal dimension (st-CLMF). The proposed method is evaluated on scenes that include low-texture regions. The experimental results show that the st-CLMF performs accurate depth estimation.
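The adaptive kernel construction that cross-based filters use can be sketched as four arms grown from an anchor pixel until the intensity differs too much, so the support window never crosses a strong edge. A toy grayscale row with invented intensities:

```python
import numpy as np

def cross_arms(img, y, x, tau, max_arm):
    """Arm lengths (up, down, left, right) of a cross-based adaptive
    kernel: each arm extends while the intensity stays within tau of
    the anchor pixel, so the support stops at object edges."""
    h, w = img.shape
    def extend(dy, dx):
        n = 0
        while n < max_arm:
            yy, xx = y + (n + 1) * dy, x + (n + 1) * dx
            if not (0 <= yy < h and 0 <= xx < w):
                break
            if abs(float(img[yy, xx]) - float(img[y, x])) > tau:
                break
            n += 1
        return n
    return [extend(-1, 0), extend(1, 0), extend(0, -1), extend(0, 1)]

# One image row with an intensity edge between columns 2 and 3.
img = np.array([[10, 10, 10, 50, 50, 50]])
print(cross_arms(img, 0, 2, tau=5, max_arm=4))  # [0, 0, 2, 0]: arms stop at the edge
```

The spatiotemporal extension in the paper additionally grows such supports across frames, which this spatial-only sketch omits.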


International Conference on 3D Vision | 2015

Depth Estimation Based on an Infrared Projector and an Infrared Color Stereo Camera by Using Cross-Based Dynamic Programming with Cost Volume Filter

Kensuke Hisatomi; Masanori Kano; Kensuke Ikeya; Miwa Katayama; Tomoyuki Mishina; Kiyoharu Aizawa

This paper presents a method to estimate a depth map using an infrared projector and a pair of infrared color cameras that can capture infrared and color images simultaneously. The infrared projector projects a dot pattern so that the cameras capture infrared images of a scene textured by the dots, with which the depths to surfaces in the scene can be estimated regardless of whether they have visible textures. Cost volumes are calculated for each of the infrared and color stereo images and are processed with a cost volume filter that smooths each cost map with a cross-based local multipoint filter. The filtered infrared and color cost volumes are then integrated into a single cost volume by selecting either the infrared or the color cost for each pixel according to the size of the adaptive kernel used for the cross-based local multipoint filter; this improves the accuracy where the adaptive kernel is small. We propose a method that finds optimal local curved surfaces of adaptive kernels from the cost volume by using three dynamic programming passes. We tested this depth estimation method on real-world datasets captured by the infrared color stereo cameras. We also applied it to color stereo matching and showed that it works with normal color stereo cameras as well.
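Dynamic programming over a cost volume can be illustrated on a single scan line: accumulate matching cost with small penalties for disparity changes, then backtrack the cheapest path. This generic scanline optimisation is much simpler than the paper's cross-based three-pass formulation; the costs and penalties are invented.

```python
import numpy as np

def scanline_dp(cost, p1=1.0, p2=3.0):
    """Disparity along one scan line by dynamic programming: penalty p1
    for a one-level disparity change, p2 for larger jumps; backtrack
    the cheapest accumulated path. cost is (pixels x disparity levels)."""
    n, levels = cost.shape
    d_axis = np.arange(levels)
    acc = np.zeros_like(cost)
    back = np.zeros((n, levels), dtype=int)
    acc[0] = cost[0]
    for x in range(1, n):
        for d in range(levels):
            jump = np.abs(d_axis - d)
            penalty = np.where(jump == 0, 0.0, np.where(jump == 1, p1, p2))
            cand = acc[x - 1] + penalty
            back[x, d] = int(cand.argmin())
            acc[x, d] = cost[x, d] + cand.min()
    d = int(acc[-1].argmin())
    path = [d]
    for x in range(n - 1, 0, -1):  # backtrack the optimal predecessors
        d = back[x, d]
        path.append(d)
    return path[::-1]

# Invented costs: the left half matches disparity 0, the right half 2.
cost = np.array([[0, 2, 2], [0, 2, 2], [2, 2, 0], [2, 2, 0]], dtype=float)
print(scanline_dp(cost))  # [0, 0, 2, 2]
```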


Proceedings of the ITE Convention 2002 (ITE Winter Annual Convention) | 2002

An experiment of 3D model estimation for generation of a VRML animation from multi-view HDTV images

Miwa Katayama; Kimihiro Tomiyama; Yuichi Iwadate; Hiroyuki Imaizumi

1. Introduction
With the aim of generating images from arbitrary viewpoints, we have been conducting experiments in which multiple HDTV cameras placed around a subject capture multi-view images, the subject's 3D shape is estimated from those images, and the resulting 3D shape model is described and displayed in VRML [1]. This paper reports that combining the block matching method with the volume intersection method in 3D shape model generation improves the estimated 3D shape and the image quality of the generated animation.

2. 3D shape model generation procedure
The 3D shape model is generated in the following three steps.

2.1 All-around capture of the subject
As shown in Fig. 1, nine HDTV cameras are arranged around the subject on a circle of radius 2.5 m at roughly equal intervals. A blue background is used so that the subject region can easily be extracted by chroma keying, which is applied to obtain the subject region for each camera.

2.2 3D model generation
The volume intersection method and the block matching method are combined to estimate the distance from the cameras to the subject. The volume intersection method obtains an approximate shape of the subject at low computational cost, but it cannot correctly estimate curved surfaces, concave ones in particular. The block matching method, on the other hand, can estimate distances even for concave surfaces, but its computational cost is high. We therefore apply block matching with the search range restricted to the approximate shape obtained from the subject regions by the volume intersection method, which achieves fast and accurate distance estimation. From the per-pixel distance values estimated at each camera position, the discrete marching cubes method yields triangular patches representing the subject's surface; each vertex of a patch is assigned the color of the corresponding image pixel.

2.3 Animation display in a VRML browser
The triangular patch model of the surface is described for each frame with a VRML IndexedFaceSet shape node. The displayed VRML data is switched by JavaScript to produce an animation. While the animation plays, the viewpoint can be moved smoothly with the mouse.

3. Comparison of 3D shape estimation results
Figure 2 shows one frame generated from a moved viewpoint based on the estimated 3D shape. To compare the estimation accuracy, the viewpoint was placed directly above the subject, where no real camera exists. The region from the subject's arms to the chest is concave as seen from the capture cameras, so the model from the volume intersection method alone (a) fails to estimate the shape correctly and the image quality degrades, whereas combining the block matching method (b) estimates the shape correctly and improves the image quality. When the animation is played, the degradation in (a) becomes conspicuous because the estimation errors change from frame to frame.

4. Conclusion
We showed that estimating 3D shape with a combination of the volume intersection and block matching methods improves the image quality of VRML animations. In future work, we will study methods that improve the accuracy and processing speed of 3D shape estimation so as to further improve the image quality of the generated animations.

References
[1] Katayama et al., "Prototype system for generating arbitrary viewpoint images with multi-view HDTV cameras," IEICE General Conference, D-11-160, Mar. 2002.
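The block matching step combined with the visual hull above can be sketched as a sum-of-absolute-differences search along the scan line; the images and search range here are a toy example, not the HDTV data.

```python
import numpy as np

def block_match(left, right, y, x, block=3, max_disp=4):
    """Disparity at (y, x) by block matching: slide a block x block
    window along the scan line of the right image and keep the shift
    with the smallest sum of absolute differences (SAD)."""
    h = block // 2
    ref = left[y - h:y + h + 1, x - h:x + h + 1].astype(float)
    best_d, best_sad = 0, float("inf")
    for d in range(max_disp + 1):
        if x - h - d < 0:                      # window would leave the image
            break
        cand = right[y - h:y + h + 1, x - h - d:x + h + 1 - d].astype(float)
        sad = float(np.abs(ref - cand).sum())
        if sad < best_sad:
            best_d, best_sad = d, sad
    return best_d

# A bright feature at column 5 in the left view appears at column 3
# in the right view, i.e. a true disparity of 2.
left = np.zeros((5, 9)); left[2, 5] = 100
right = np.zeros((5, 9)); right[2, 3] = 100
print(block_match(left, right, 2, 5))  # 2
```

Restricting `max_disp` to the depth interval allowed by the visual hull is what gives the speed-up described in Section 2.2.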


ITCom 2002: The Convergence of Information Technologies and Communications | 2002

A system for generating arbitrary viewpoint images from HDTV multi-camera images

Miwa Katayama; Yuichi Iwadate; Kimihiro Tomiyama; Hiroyuki Imaizumi

In this paper, we propose a system for generating arbitrary viewpoint images. The system is based on image measurement and consists of three steps: HDTV image recording, modeling from images, and display of arbitrary viewpoint images. The model data are converted to VRML models. To estimate 3D shapes, we developed a new modeling algorithm that uses the block matching method together with the volume intersection method and achieves fast and precise modeling. We confirmed that the resulting human model with motion plays back smoothly in a VRML browser on a PC and that the observer's viewing position can be changed with the mouse.
