US20110050685A1 - Image processing apparatus, image processing method, and program - Google Patents

Image processing apparatus, image processing method, and program Download PDF

Info

Publication number
US20110050685A1
US20110050685A1 (application US 12/859,110)
Authority
US
United States
Prior art keywords
image
frame picture
input image
object area
binary mask
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US12/859,110
Inventor
Hideshi Yamada
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sony Corp
Original Assignee
Sony Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sony Corp filed Critical Sony Corp
Assigned to SONY CORPORATION (assignment of assignors interest; see document for details). Assignor: YAMADA, HIDESHI
Publication of US20110050685A1 publication Critical patent/US20110050685A1/en

Links

Images

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 15/00: 3D [Three Dimensional] image rendering
    • G06T 15/10: Geometric effects

Definitions

  • the present invention relates to an image processing apparatus, an image processing method, and a program and, more particularly, to an image processing apparatus that can easily create a pseudo three-dimensional image by combining an object image, which is obtained from an input image and a binary mask image that specifies an object area on the input image, with a planar image that simulates a picture frame or architrave, to an image processing method, and to a program.
  • a pseudo image is created by adding a depth image to a two-dimensional image rather than by supplying a three-dimensional image.
  • Japanese Unexamined Patent Application Publication No. 2008-084338 proposes a method of creating a pseudo three-dimensional image by adding relief-like depth data to texture data, which is divided into objects.
  • An algorithm of software that aids pseudo three-dimensional image creation is also proposed, according to which a user deforms or moves an object to be combined by using a mouse or another pointer to edit a shadow of a photo object or computer graphics (CG) object (see 3D-aware Image Editing for Out of Bounds Photography, Amit Shesh et al., Graphics Interface, 2009).
  • An image processing apparatus creates a pseudo three-dimensional image that improves depth perception of the image;
  • the image processing apparatus includes an input image acquiring means for acquiring an input image and a binary mask image that specifies an object area on the input image, a combining means for extracting pixels in an area inside a quadrangular frame picture of the input image and pixels in the object area, specified by the binary mask image, on the input image to create a combined image, and a frame picture combining position determining means for determining a position on the combined image at which the quadrangular frame picture is placed so that one of a pair of opposite edges of the quadrangular frame picture includes an intersection with a boundary of the object area and the other of the pair does not include an intersection with the boundary of the object area.
  • the quadrangular frame picture can be formed so that the edge that does not include the intersection with the boundary of the object area is longer than the edge that includes the intersection.
  • the position of the quadrangular frame picture can be determined by rotating the picture around a predetermined position.
  • the quadrangular frame picture can be formed by carrying out three-dimensional affine transformation on a predetermined quadrangular frame picture.
  • the combining means can create the combined image by continuously deforming the shape of the quadrangular frame picture and extracting the pixels in the area inside the quadrangular frame picture of the input image and the pixels in the object area, specified by the binary mask image, on the input image.
  • the combining means can create a plurality of combined images by extracting the pixels in the area inside the quadrangular frame picture, which has a plurality of types of shapes or is formed at a predetermined position, and the pixels in the object area, specified by the binary mask image, on the input image.
  • the combining means can create the combined image by storing input images or binary mask images, each of which is used to create the combined image, in correspondence to frame shape parameters, which include the rotational angle of the quadrangular frame picture, three-dimensional affine transformation parameters, and positions, by forming a frame picture with a predetermined quadrangular shape, according to the frame shape parameters stored in correspondence to a stored input image or binary mask image that is found, by comparison, to be most similar to the input image or binary mask image obtained by the input image acquiring means in the stored input images and binary mask images, and by extracting the pixels in the area inside the quadrangular frame picture of the input image and the pixels in the object area, specified by the binary mask image, on the input image.
  • An image processing method is a method for use in an image processing apparatus operable to create a pseudo three-dimensional image that improves depth perception of the image; the image processing method includes an input image acquiring step of acquiring an input image and a binary mask image that specifies an object area on the input image, a combining step of extracting pixels in an area inside a quadrangular frame picture of the input image and pixels in the object area, specified by the binary mask image, on the input image to create a combined image, and a frame picture combining position determining step of determining a position on the combined image at which the quadrangular frame picture is placed so that one of a pair of opposite edges of the quadrangular frame picture includes an intersection with a boundary of the object area and the other of the pair does not include an intersection with the boundary of the object area.
  • a program is executable by a computer that controls an image processing apparatus operable to create a pseudo three-dimensional image that improves depth perception of the image so as to execute a process including an input image acquiring step of acquiring an input image and a binary mask image that specifies an object area on the input image, a combining step of extracting pixels in an area inside a quadrangular frame picture of the input image and pixels in the object area, specified by the binary mask image, on the input image to create a combined image, and a frame picture combining position determining step of determining a position on the combined image at which the quadrangular frame picture is placed so that one of a pair of opposite edges of the quadrangular frame picture includes an intersection with a boundary of the object area and the other of the pair does not include an intersection with the boundary of the object area.
  • an input image and a binary mask image that specifies an object area on the input image are acquired, pixels in an area inside a quadrangular frame picture of the input image and pixels in the object area, specified by the binary mask image, on the input image are extracted to create a combined image, and a position on the combined image at which the quadrangular frame picture is placed is determined so that one of a pair of opposite edges of the quadrangular frame picture includes an intersection with a boundary of the object area and the other of the pair does not include an intersection with the boundary of the object area.
  • a pseudo three-dimensional image can be easily created by combining an object image, which is obtained from an input image and a binary mask image that specifies an object area on the input image, with a planar image that simulates a picture frame or architrave.
  • FIG. 1 is a block diagram showing an example of the structure of a pseudo three-dimensional image creating apparatus in an embodiment of the present invention
  • FIG. 2 is a block diagram showing an example of the structure of the frame picture combining parameter calculator in FIG. 1 ;
  • FIG. 3 is a flowchart illustrating a pseudo three-dimensional image creation process
  • FIG. 4 shows an input image and its binary mask image
  • FIG. 5 illustrates a frame picture texture image
  • FIG. 6 illustrates three-dimensional affine transformation parameters
  • FIG. 7 illustrates three-dimensional affine transformation
  • FIG. 8 is a flowchart illustrating a frame picture combining parameter calculation process
  • FIG. 9 illustrates the frame picture combining parameter calculation process
  • FIG. 10 also illustrates the frame picture combining parameter calculation process
  • FIG. 11 shows an object layer image and a frame layer image
  • FIG. 12 shows an exemplary combined image
  • FIG. 13 illustrates a relation between a frame picture and an object image
  • FIG. 14 shows another exemplary combined image
  • FIG. 15 shows other exemplary combined images
  • FIG. 16 shows other exemplary combined images
  • FIG. 17 is a block diagram showing the structure of an example of a general-purpose personal computer.
  • FIG. 1 is a block diagram showing an example of the structure of a pseudo three-dimensional image creating apparatus in an embodiment of the present invention.
  • the pseudo three-dimensional image creating apparatus 1 in FIG. 1 combines an input image, a binary mask image, from which an object area on the input image has been cut off, and a frame picture texture image to create an image that spuriously appears to be a stereoscopic three-dimensional image.
  • the pseudo three-dimensional image creating apparatus 1 combines an image obtained by cutting off an object area from an input image according to its corresponding binary mask image with an image obtained by performing projection deformation of a frame picture texture image.
  • the pseudo three-dimensional image creating apparatus 1 has an input image acquiring unit 11 , a frame picture texture acquiring unit 12 , a three-dimensional affine transformation parameter acquiring unit 13 , a rectangular three-dimensional affine transformer 14 , a frame picture combining parameter calculator 15 , a frame picture combining unit 16 , and an output unit 17 .
  • the input image acquiring unit 11 acquires an input image and a binary mask image that specifies an object area on the input image, and supplies the acquired images to the frame picture combining parameter calculator 15 .
  • the input image is an RGB color image in red, green, and blue, for example.
  • the binary mask image has the same resolution as the input image and holds one of two values such as 1 and 0 to indicate whether the relevant pixel is included in the object area, for example.
  • the input image and binary mask image are arbitrarily selected or supplied by the user. Of course, the input image and binary mask image are made to correspond to each other.
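  • As a minimal illustration of the input stage described above (not part of the original disclosure), the following Python sketch loads an input image and its binary mask and checks the constraints stated here (same resolution, values of 1 and 0); the file paths and helper name are hypothetical.

```python
import numpy as np
from PIL import Image

def acquire_input_and_mask(image_path, mask_path):
    """Load an RGB input image and its binary mask image, checking that the mask
    has the same resolution as the input image and reducing it to values 0 and 1."""
    rgb = np.asarray(Image.open(image_path).convert("RGB"))   # H x W x 3 input image
    mask = np.asarray(Image.open(mask_path).convert("L"))     # H x W mask image
    if rgb.shape[:2] != mask.shape:
        raise ValueError("binary mask must have the same resolution as the input image")
    alpha = (mask > 127).astype(np.uint8)                     # 1 inside the object area, 0 elsewhere
    return rgb, alpha
```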
  • the frame picture texture acquiring unit 12 acquires a texture image to be attached to a quadrangular frame picture in, for example, a square shape, and supplies the texture image to the frame picture combining unit 16 .
  • the texture image visually appears as a plane; an example of it is an image that simulates a white frame of a printed photo.
  • the three-dimensional affine transformation parameter acquiring unit 13 acquires three-dimensional affine transformation parameters, which are used in three-dimensional affine transformation performed on the frame picture texture image, and supplies these parameters to the rectangular three-dimensional affine transformer 14 .
  • the three-dimensional affine transformation parameters may be directly specified with numerals or may be arbitrarily set according to user input operations through graphical user interfaces (GUIs) such as mouse drags and scroll bars.
  • the rectangular three-dimensional affine transformer 14 calculates rectangular parameters from the three-dimensional affine transformation parameters acquired from the three-dimensional affine transformation parameter acquiring unit 13 and supplies the calculated rectangular parameters to the frame picture combining parameter calculator 15 .
  • the rectangular parameters indicate the two-dimensional coordinates of the four vertexes of the frame picture texture image after the three-dimensional affine transformation and the central position of the rectangle.
  • the aspect ratio of the original rectangle used for the transformation may be specified by the user by operating an operation unit (not shown). Alternatively, the aspect ratio of the frame picture texture image entered by operating the operation unit may be used instead.
  • the frame picture combining parameter calculator 15 calculates the positions and scales of the input image and binary mask image, supplied from the input image acquiring unit 11 , and the frame picture to be combined, and supplies frame picture parameters to the frame picture combining unit 16 together with the input image and binary mask image.
  • the frame picture parameters supplied to the frame picture combining unit 16 indicate the four two-dimensional vertex coordinates of the quadrangular frame picture in the image coordinate system. The structure of the frame picture combining parameter calculator 15 will be described later in detail with reference to FIG. 2 .
  • the frame picture combining unit 16 combines the input image, the binary mask image, and a frame shape structure image together according to the frame picture combining parameters to create a pseudo three-dimensional image in which the object visually appears to be stereoscopic, and then outputs the created image to the output unit 17 .
  • the frame picture combining unit 16 includes an object layer image creating unit 16 a and a frame layer image creating unit 16 b.
  • the object layer image creating unit 16 a creates an image in the object area, that is, an object layer image from the input image, binary mask image, and frame shape structure image, according to the frame picture combining parameters.
  • the frame layer image creating unit 16 b creates an image in the frame picture texture area, that is, a frame layer image from the input image, binary mask image, and frame shape structure image, according to the frame picture combining parameters.
  • the frame picture combining unit 16 combines the object layer image and frame layer image, which have been thus created, together to create a combined image, which is a pseudo three-dimensional image.
  • the output unit 17 receives a combined image created as a pseudo three-dimensional image by the frame picture combining unit 16 , and outputs the received image.
  • the frame picture combining parameter calculator 15 has a mask barycenter calculator 51 , a frame picture scale calculator 52 , and a frame picture vertex calculator 53 .
  • the frame picture combining parameter calculator 15 determines constraint conditions, which are used to obtain a frame picture shape, from the binary mask image to determine the position and scale of the frame picture.
  • the mask barycenter calculator 51 obtains, as the barycenter position, the average of the positions of the pixels in the object area, that is, of all pixels with a value of 1 in the binary mask image. Then, the mask barycenter calculator 51 sends the average to the frame picture scale calculator 52 .
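  • A minimal sketch of this barycenter step, assuming the binary mask is a NumPy array with value 1 inside the object area (the function name is illustrative):

```python
import numpy as np

def mask_barycenter(alpha):
    """Return the barycenter position BC: the average (x, y) position of all pixels
    whose mask value is 1, i.e., the pixels in the object area."""
    ys, xs = np.nonzero(alpha)                 # rows (y) and columns (x) of object pixels
    return float(xs.mean()), float(ys.mean())
```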
  • the frame picture scale calculator 52 has a central position calculator 52 a, a scale calculator 52 b, and a scale deciding unit 52 c.
  • the frame picture scale calculator 52 calculates a frame picture central position P_FRAME and a scale S_FRAME from the barycenter position and a frame setting angle θg, which is an input parameter, and sends the calculated values to the frame picture vertex calculator 53 .
  • the frame picture central position P_FRAME and scale S_FRAME will be described later in detail.
  • the frame picture vertex calculator 53 receives the frame picture central position P_FRAME and scale S_FRAME from the frame picture scale calculator 52 , and outputs the four vertexes, which are frame picture combining parameters.
  • step S 11 the input image acquiring unit 11 acquires an input image and a binary mask image corresponding to the input image and then sends them to the frame picture combining parameter calculator 15 .
  • An exemplary input image and its corresponding binary mask image are respectively shown on the left and right in FIG. 4 .
  • the butterfly on the input image is an object image, so, on the binary mask image, pixels in the area in which the butterfly is displayed are displayed in white and pixels in the remaining area are displayed in black.
  • step S 12 the frame picture texture acquiring unit 12 acquires a frame picture texture image, which is selected when an operation unit (not shown) including a mouse and keyboard is operated, and sends the acquired image to the frame picture combining unit 16 .
  • An exemplary frame picture texture image is shown in FIG. 5 ; the image is formed by pixels, each of which has a value α.
  • the outermost edge forming a frame is set to black, the pixel value α being 0; the inner edge next to the frame is set to white, the pixel value α being 1; the central part is set to black, the pixel value α being 0. That is, the frame picture texture image in FIG. 5 is formed from black and white edges.
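  • The following sketch builds an alpha pattern like the frame picture texture image described above: α is 0 on the outermost edge, 1 on the white frame band, and 0 again in the central part. The image size and band widths are arbitrary choices made only for illustration.

```python
import numpy as np

def make_frame_texture(size=256, outer=4, band=24):
    """Square frame texture: alpha 0 on the outer border, 1 on the frame band,
    0 in the central part (sizes chosen only for illustration)."""
    alpha = np.zeros((size, size), dtype=np.float32)
    alpha[outer:size - outer, outer:size - outer] = 1.0       # white frame band
    inner = outer + band
    alpha[inner:size - inner, inner:size - inner] = 0.0       # central part back to 0
    return alpha
```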
  • step S 13 the three-dimensional affine transformation parameter acquiring unit 13 acquires three-dimensional affine transformation parameters, which are used to carry out three-dimensional affine transformation on the frame picture texture image, when the operation unit (not shown) is operated, and sends the acquired parameters to the rectangular three-dimensional affine transformer 14 .
  • the three-dimensional affine transformation parameters are used to carry out affine transformation on a quadrangular frame picture so that the picture visually appears like a stereoscopic shape.
  • these parameters are a rotation θx around the x axis, which is in the horizontal direction, a rotation θz around the z axis, which is the line of sight, a distance f from an imaging position P to the frame used as the frame picture texture, which is a subject, a distance tx traveled in the x direction, which is horizontal in the image, and a distance ty traveled in the y direction, which is vertical in the image.
  • step S 14 the rectangular three-dimensional affine transformer 14 receives the three-dimensional affine transformation parameters sent from the three-dimensional affine transformation parameter acquiring unit 13 , calculates rectangular parameters, and sends the calculated parameters to the frame picture combining parameter calculator 15 .
  • the rectangular three-dimensional affine transformer 14 obtains transformed coordinates by using a coordinate system, in which the central point of a rectangular frame picture is fixed to the origin (0, 0), the coordinate system being normalized to match the width in the x or y direction, whichever is longer. That is, when the rectangular frame picture is square, the rectangular three-dimensional affine transformer 14 sets the rectangular center RC and the four vertex coordinates p 0 (−1, −1), p 1 (1, −1), p 2 (1, 1), p 3 (−1, 1), which are taken before transformation.
  • the rectangular three-dimensional affine transformer 14 then assigns the vertex coordinates p 0 to p 3 , rectangular center RC, and three-dimensional affine transformation parameters to equation (1) to calculate vertex coordinates p 0 ′ to p 3 ′ and rectangular center RC′ transformed by three-dimensional affine transformation.
  • R θz is a rotational transformation matrix, represented by equation (2), that corresponds to a rotation θz about the z axis
  • R θx is a rotational transformation matrix, represented by equation (3), that corresponds to a rotation θx about the x axis
  • T s is a transformation matrix, represented by equation (4), that corresponds to the distances tx and ty
  • T f is a transformation matrix, represented by equation (5), that corresponds to the distance f.
  • R_{\theta z} = \begin{bmatrix} \cos\theta_z & -\sin\theta_z & 0 & 0 \\ \sin\theta_z & \cos\theta_z & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad (2)
  • R_{\theta x} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta_x & \sin\theta_x & 0 \\ 0 & -\sin\theta_x & \cos\theta_x & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad (3)
  • T_s = \begin{bmatrix} 1 & 0 & 0 & t_x \\ 0 & 1 & 0 & t_y \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad (4) \qquad T_f = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & f \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad (5)
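  • Equation (1) itself is not reproduced on this page, so the composition order and the pinhole projection in the following Python sketch are assumptions consistent with equations (2) to (5) and with the parameters defined above; the sketch transforms the normalized square vertices p0 to p3 and the center RC into p0′ to p3′ and RC′.

```python
import numpy as np

def transform_rectangle(theta_z, theta_x, tx, ty, f):
    """Apply R_theta_z, R_theta_x, T_s and T_f (equations (2)-(5)) to the normalized
    square and project to 2-D; the order Tf*Ts*Rx*Rz and the projection are assumed."""
    cz, sz = np.cos(theta_z), np.sin(theta_z)
    cx, sx = np.cos(theta_x), np.sin(theta_x)
    Rz = np.array([[cz, -sz, 0, 0], [sz, cz, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]])
    Rx = np.array([[1, 0, 0, 0], [0, cx, sx, 0], [0, -sx, cx, 0], [0, 0, 0, 1]])
    Ts = np.array([[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, 0], [0, 0, 0, 1]])
    Tf = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, f], [0, 0, 0, 1]])
    M = Tf @ Ts @ Rx @ Rz
    # p0..p3 of the normalized square plus its center RC, in homogeneous coordinates.
    points = np.array([[-1, -1, 0, 1], [1, -1, 0, 1], [1, 1, 0, 1],
                       [-1, 1, 0, 1], [0, 0, 0, 1]], dtype=float)
    cam = points @ M.T
    projected = f * cam[:, :2] / cam[:, 2:3]   # simple pinhole projection (assumed)
    return projected[:4], projected[4]         # p0'..p3' and the center RC'
```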
  • a frame picture texture image such as an upper image in FIG. 7 , represented by the vertex coordinates p 0 to p 3 of a rectangle and its center RC, is transformed into a frame picture texture image such as a lower image in FIG. 7 , represented by the vertexes p 0 ′ to p 3 ′ of another rectangle and its center RC′.
  • step S 15 the frame picture combining parameter calculator 15 executes a frame picture combining parameter calculation process to calculate frame picture combining parameters and sends the calculated parameters to the frame picture combining unit 16 .
  • the mask barycenter calculator 51 calculates the mask barycenter position BC of the shape of the object from the binary mask image, and sends the calculated barycenter position to the frame picture scale calculator 52 . Specifically, as shown in FIG. 9 , the mask barycenter calculator 51 extracts pixels with a pixel value α of 1 (pixels in white in the drawing) from all pixels in the binary mask image, which forms an object of a butterfly, and determines the average coordinates of these pixel positions as the mask barycenter position BC.
  • step S 32 the frame picture scale calculator 52 controls the central position calculator 52 a to calculate the frame picture central position P_FRAME from the mask barycenter position BC received from the mask barycenter calculator 51 and from the frame setting angle θg, which is an input parameter.
  • the central position calculator 52 a first calculates a contour point CP to determine the position of the frame picture. That is, the central position calculator 52 a obtains a vector RV, which has been rotated clockwise by the frame setting angle θg from the lower direction of the image, as shown in FIG. 9 , the lower direction being handled as a reference vector. The central position calculator 52 a further obtains, as the contour position CP, a two-dimensional position at which the pixel value α first changes from 1 to 0 during a motion from the mask barycenter position BC in the direction of the vector RV, that is, at which the contour of the object area (boundary of the object area) is first encountered, as shown in FIG. 9 .
  • the contour position CP is the central position P_FRAME of the frame picture texture.
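  • A minimal sketch of this search for the contour point CP, assuming image coordinates with y increasing downward and a clockwise rotation convention chosen so that θg = 90 degrees points toward the left side of the image (function and variable names are illustrative):

```python
import numpy as np

def contour_point(alpha, bc, theta_g_deg):
    """Walk from the mask barycenter BC along the reference vector RV (the downward
    direction rotated clockwise by theta_g) and return the first position whose mask
    value is 0, i.e., where the boundary of the object area is crossed."""
    h, w = alpha.shape
    t = np.deg2rad(theta_g_deg)
    rv = np.array([-np.sin(t), np.cos(t)])     # assumed rotation convention (see lead-in)
    x, y = bc
    while 0 <= int(round(x)) < w and 0 <= int(round(y)) < h:
        if alpha[int(round(y)), int(round(x))] == 0:
            return x, y                        # first point outside the object area
        x += rv[0]
        y += rv[1]
    return x, y                                # reached the image border inside the object
```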
  • step S 33 the scale calculator 52 b sets the frame picture texture image to calculate the scale S_FRAME, which is the scale of the frame picture.
  • the scale calculator 52 b rotates the frame picture texture image formed by the vertex coordinates p 0 ′ to p 3 ′ of the rectangle and its center RC′, which are obtained after three-dimensional affine transformation, by the frame setting angle θg, to update the vertex coordinates to p 0 ′′ to p 3 ′′. That is, the frame picture texture image is rotated clockwise, centered around the rectangular center RC′, and the vertex coordinates p 0 ′ to p 3 ′ are updated to the vertex coordinates p 0 ′′ to p 3 ′′.
  • if the frame setting angle θg is 0 degrees, the frame picture texture is disposed at the bottom of the object; if θg is 90 degrees, the frame picture texture is disposed so that it stands on the left side of the object.
  • step S 34 the scale calculator 52 b determines a longer edge LE and a shorter edge SE from the vertex coordinates p 0 ′′ to p 3 ′′ to obtain a straight line of each edge.
  • the longer edge LE is the longest edge of the frame picture texture and the shorter edge SE is the edge opposite to the longer edge LE, as shown in FIG. 10 .
  • the edge placed next to the longer edge LE is the left edge L 0 and the edge placed next to the shorter edge SE is the right edge L 1 .
  • the scale calculator 52 b calculates, as a longer-edge scale S_LE, a scale when the longer edge LE passes through the farthest point in the direction of the vector RV of the binary mask image. Specifically, in the case shown in FIG. 10 , the scale calculator 52 b calculates, as the longer-edge scale S_LE, the scale when the longer edge LE passes through the intersection F 1 (on the straight line T 4 ), which is the farthest point intersecting with the object image in the direction of the vector RV from the straight line T 3 , which passes through the mask barycenter position BC and is orthogonal to the vector RV. That is, when the frame picture is enlarged or reduced about the central position P_FRAME (contour point CP), the longer scale S_LE is obtained as an enlargement ratio or reduction ratio when the longer edge LE is disposed on the straight line T 4 .
  • the scale calculator 52 b calculates, as a shorter-edge scale S_SE, a scale when the shorter edge SE passes through the farthest point in the direction opposite to the direction of the vector RV of the binary mask image. Specifically, in the case shown in FIG. 10 , the scale calculator 52 b calculates, as the shorter-edge scale S_SE, the scale when the shorter edge SE passes through the intersection F 3 (on the straight line T 5 ), which is the farthest point intersecting with the object image in the direction opposite to the direction of the vector RV from the straight line T 3 , which passes through the mask barycenter position BC and is orthogonal to the vector RV. That is, when the frame picture is enlarged or reduced about the central position P_FRAME (contour point CP), the shorter scale S_SE is obtained as an enlargement ratio or reduction ratio when the shorter edge SE is disposed on the straight line T 5 .
  • step S 36 the scale calculator 52 b calculates, as a left-edge scale S_L 0 , a scale when the left edge L 0 is in the direction of the vector RV relative to the straight line T 3 , which passes through the mask barycenter position BC and is perpendicular to the vector RV, and includes the intersection F 1 (on the straight line T 1 ) with the object image in the area R 0 on the left edge L 0 side relative to the straight line R 0 R that passes through the mask barycenter position BC and is parallel to the left edge L 0 and when the left edge L 0 passes through the intersection F 1 with the object image, which is at the farthest point from the straight line R 0 R that passes through the mask barycenter position BC and is parallel to the left edge L 0 .
  • the left-edge scale S_L 0 is obtained as the enlargement ratio or reduction ratio applied when the left-edge L 0 is positioned on the straight line T 1 .
  • step S 37 the scale calculator 52 b calculates, as a right-edge scale S_L 1 , a scale when the right edge L 1 is in the direction of the vector RV relative to the straight line T 3 , which passes through the mask barycenter position BC and is perpendicular to the vector RV, and includes the intersection F 2 (on the straight line T 2 ) with the object image in the area R 1 on the right edge L 1 side relative to the straight line R 1 L that passes through the mask barycenter position BC and is parallel to the right edge L 1 and when the right edge L 1 passes through the intersection F 2 with the object image, which is at the farthest point from the straight line R 1 L that passes through the mask barycenter position BC and is parallel to the right edge L 1 .
  • the right-edge scale S_L 1 is obtained as the enlargement ratio or reduction ratio applied when the right edge L 1 is positioned on the straight line T 2 .
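  • The per-edge scales above can be read as ratios of distances measured from the scaling center P_FRAME (the contour point CP). The following sketch is a reconstruction of that idea, not the patent's exact formula: the scale is the factor by which the frame must be enlarged or reduced about CP so that a given edge's line passes through its farthest object point.

```python
import numpy as np

def edge_scale(cp, edge_p, edge_q, target_point):
    """Scale factor, about CP, that makes the edge through edge_p and edge_q pass
    through target_point (e.g. the farthest object point for that edge)."""
    cp, p, q, t = map(np.asarray, (cp, edge_p, edge_q, target_point))
    d = q - p
    n = np.array([-d[1], d[0]])
    n = n / np.linalg.norm(n)              # unit normal of the edge's line
    dist_edge = abs(np.dot(p - cp, n))     # distance from CP to the edge's line
    dist_target = abs(np.dot(t - cp, n))   # distance from CP to the target point
    return dist_target / dist_edge
```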
  • step S 38 the scale deciding unit 52 c calculates the scale S_FRAME of the frame picture texture by using the longer-edge scale S_LE, shorter-edge scale S_SE, left-edge scale S_L 0 , and right-edge scale S_L 1 , according to equation (6) below.
  • in equation (6), the coefficient, which takes a value of 1 or more, is an arbitrary coefficient used to adjust the size of the frame picture, MAX(A, B, C) is a function that selects the maximum of values A to C, and MIN(D, E) is a function that selects the minimum of values D and E.
  • the scale deciding unit 52 c obtains the maximum value of the longer-edge scale S_LE, left-edge scale S_L 0 , and right-edge scale S_L 1 and also obtains the minimum value of the obtained maximum value and shorter-edge scale S_SE, as the scale S_FRAME of the frame picture texture.
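  • From the two preceding items, equation (6) presumably takes the following form (a reconstruction; the size-adjustment coefficient, written here as k, is the value of 1 or more mentioned above): S_FRAME = k × MIN(MAX(S_LE, S_L 0 , S_L 1 ), S_SE).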
  • the frame picture scale calculator 52 then sends the calculated scale S_FRAME and central position P_FRAME to the frame picture vertex calculator 53 .
  • step S 39 the frame picture vertex calculator 53 uses the central position P_FRAME and scale S_FRAME of the frame picture texture, which have been received from the frame picture scale calculator 52 , to perform parallel movement so that the central position RC′′ of the frame picture texture matches the central position P_FRAME, which is the contour point CP.
  • step S 40 the frame picture vertex calculator 53 enlarges each edge about the central position of the frame picture texture by an amount equal to the scale S_FRAME.
  • step S 41 the frame picture vertex calculator 53 obtains the two-dimensional positions FP 0 to FP 3 of the four vertexes of the enlarged frame picture texture, and then sends the obtained two-dimensional positions FP 0 to FP 3 of the four vertexes to the frame picture combining unit 16 at a later stage as the frame picture combining parameters.
  • the frame picture combining parameters can be set so that the two-dimensional coordinates of the four vertexes of the frame picture texture become optimum for the object area on the basis of the longer edge, shorter edge, left edge, and right edge of the frame picture texture and the farthest distance in the object area.
  • step S 15 the frame picture combining parameter calculation process is executed to calculate frame picture combining parameters, after which the sequence proceeds to step S 16 .
  • the frame picture combining unit 16 controls the object layer image creating unit 16 a to create an object layer image from an input image and binary mask image. Specifically, for example, the object layer image creating unit 16 a creates, in the object area, an object layer image as shown in the upper left part of FIG. 11 from a binary mask image as shown in the lower left part of FIG. 11 , the mask image being made up of pixels with the pixel value α being set to 1 and pixels with the pixel value α being set to 0 (indicating black).
  • the frame picture combining unit 16 controls the frame layer image creating unit 16 b to create a frame layer image rendered by mapping the frame picture texture image to the frame picture texture, which has undergone projection deformation by the frame picture combination parameters.
  • the frame layer image creating unit 16 b creates a binary mask image of a quadrangular frame picture, as shown in the lower-right part of FIG. 11 , according to two-dimensional vertex coordinates given as the frame picture parameters.
  • in the area inside the frame picture, α is 1, and the pixel values of the input image are output; in the other area, α is 0, and all pixel values are 0.
  • the frame layer image creating unit 16 b creates the frame layer image, as shown in the upper right part of FIG. 11 , from the input image and the created binary mask image of the frame picture.
  • step S 18 the frame picture combining unit 16 combines the object layer image and frame layer image together to create a combined pseudo three-dimensional image as shown in FIG. 12 , and sends the combined image to the output unit 17 .
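  • The following Python sketch illustrates steps S 16 to S 18 , assuming the projected frame picture has already been rasterized into a binary mask of its own (the drawing order, with the object layer over the frame layer, is inferred from the conditions described with FIG. 13 ):

```python
import numpy as np

def combine_layers(rgb, object_alpha, frame_alpha, background=255):
    """Object layer: input pixels inside the object area. Frame layer: input pixels
    inside the rasterized frame picture. The object layer is drawn over the frame
    layer; everything else is filled with a background color."""
    out = np.full_like(rgb, background)
    frame_mask = frame_alpha.astype(bool)
    object_mask = object_alpha.astype(bool)
    out[frame_mask] = rgb[frame_mask]      # frame layer
    out[object_mask] = rgb[object_mask]    # object layer drawn on top
    return out
```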
  • step S 19 the output unit 17 outputs the combined image, which has been created as a pseudo three-dimensional image.
  • the processes described above can thus create a pseudo three-dimensional image that gives a person depth perception by using the overlap between a frame picture texture image and the object and the perspective of a rectangle on which projection transformation has been performed.
  • depth perception can be generally attained by obtaining a clue such as perspective projection and vanishing points from a rectangle for which projection transformation has been performed.
  • a fore-and-aft relation can also be obtained visually from the order in which an object image and a frame image overlap. To have a person recognize the fore-and-aft relation represented by a perspective and overlap in this way, it may suffice to satisfy the conditions shown in FIG. 13 .
  • a first condition is that the edge on the far side of a frame picture, that is, the shorter edge overlaps an object and is behind the object. More specifically, the first condition is that, for example, as shown in FIG. 13 , the shorter edge of a frame picture V 2 has intersections with the boundary of an object area V 1 and only the object is displayed in the object area V 1 .
  • a second condition is that the edge on the near side of the frame picture, that is, the longer edge has no intersection with the boundary of the object area.
  • the second condition is that, for example, as shown in FIG. 13 , the longer edge of the frame picture V 2 has no intersection with the boundary of the object area V 1 .
  • a third condition is that the frame picture has a shape that can be three-dimensionally present.
  • the third condition is that the frame picture V 2 has a shape that can be three-dimensionally present.
  • the first and second conditions are satisfied by disposing the longer edge B of the frame picture V 2 , a straight line C passing through a bottom point of the object area, and the shorter edge A of the frame picture V 2 in that order from the near side, as shown in FIG. 13 . That is, it suffices that the shorter edge of the frame picture V 2 has intersections with the boundary of the object area, the object is displayed between the intersections, and the longer edge of the frame picture V 2 has no intersection with the boundary of the object area.
  • one of the scales obtained by enlarging or reducing the frame picture about the central position P_FRAME so that the longer edge, shorter edge, right edge, or left edge passes through its farthest point of the object area is set as the scale S_FRAME. Accordingly, the scale of the frame picture is determined so that the longer edge has no intersection with the object area and the shorter edge has intersections with the object area.
  • a pseudo three-dimensional image can be easily created by combining an object image, which is obtained from an input image and a binary mask image that specifies an object area on the input image, with a planar image that simulates a picture frame or architrave.
  • when the frame picture is deformed only by three-dimensional affine transformation, the frame picture can remain in a shape that can exist three-dimensionally.
  • when a texture is mapped to the frame picture itself by, for example, projection transformation, information usable as a clue to perspective can be given, improving depth perception.
  • a pseudo three-dimensional image that a user can enjoy can also be created.
  • the barycenter of the object area is obtained, for example, after which, centered around the barycenter, the widths can be calculated as twice the maximum value and minimum value in the X direction of the object area, and the heights can be calculated as half the maximum value and minimum value in the Y direction.
  • a depth emphasizing effect can be obtained just by placing the frame picture behind the object.
  • the frame picture combining parameter calculator 15 can also place the frame picture upside down or oppositely, rather than on the ground, by adjusting the frame setting angle θg. Specifically, as shown in FIG. 15 , the frame picture can be placed behind the airplane-shaped toy, which is the object, or inverted parallel to the toy.
  • the frame picture combining parameter calculator 15 may also calculate the N-order moment of the binary mask image and the center of a bounding box or the center of a circumscribed circle as the parameters to calculate the frame picture shape. That is, mask image distribution may be considered for the central position instead of using a simple barycenter position.
  • the frame picture combining parameter calculator 15 may obtain the parameters to calculate the frame picture shape not only from the binary mask image but also from the input image itself. Specifically, the vanishing points of the image or the ground may be detected to determine the shape and position of the frame picture so that an edge of the frame picture is placed along a vanishing line of the input image or in a ground area. For a method of automatically detecting a vanishing line from an image, see "A New Approach for Vanishing Point Detection in Architectural Environments, Carsten Rother, BMVC 2000".
  • edges of an architectural structure are detected and the direction of parallel edges is statistically processed to calculate vanishing points.
  • Two vanishing points obtained by this method can be used to calculate the frame picture combining parameters. Specifically, the constraint that opposite edges of the frame picture converge at two different vanishing points is added in determination of the position and shape of the frame picture.
  • a projection transformation parameter f of the frame picture may also be determined by obtaining an approximate object size from object classification based on machine learning.
  • a pseudo three-dimensional image that is more naturally stereoscopic may be created by using camera parameters for macro photography when the object is small like a cup or by using camera parameters for telescopic photography when the object is large like a building.
  • machine learning is carried out in advance on features based on the relation between the local features of an object and the image in which the object is found.
  • during frame layer image creation, the frame picture combining unit 16 may also render a frame picture to which a texture image is not mapped.
  • a rectangle may be drawn just by specifying a color for the frame picture or the pixel colors of the input image may be drawn.
  • a user interface may be provided so that the user can correct the shape of the frame picture while viewing the pseudo three-dimensional image calculated by the frame picture combining unit 16 .
  • the user may operate the user interface to move the four vertexes of the frame picture or move the entire frame picture.
  • an interface to change the vanishing point to deform the frame picture may be provided.
  • a user input may be supplied to the three-dimensional affine transformation parameter acquiring unit 13 to directly update the frame shape parameters.
  • the frame picture combining unit 16 may deform the binary mask image itself. Specifically, when a frame picture object is combined at the bottom of an object area, specified by the binary mask image, that continuously extends to the bottom of the image, the binary mask image may be cut so that the binary mask image does not extend beyond the frame picture toward the near side, creating a pseudo three-dimensional image that is naturally stereoscopic.
  • the input image is not limited to a still image; it may be a moving image.
  • the frame picture parameters may be determined from a representative moving image frame and a mask image to determine the shape of the frame picture. To determine the shape of the frame picture, the frame picture parameters may also be determined for each moving image frame.
  • the frame picture may not be a still image; an image created by changing the three-dimensional affine transformation parameters or frame setting angle parameters may be animated.
  • the pseudo three-dimensional image creating apparatus may present pseudo three-dimensional images created by a combination of a plurality of parameters within a predetermined parameter range, and the user may select a preferable image from the presented images.
  • the frame picture combining unit 16 may use processed input images, such as blurred input images, gray-scaled images, or images with low brightness, instead of filling the areas other than the frame picture and object, that is, the background with a background color.
  • An alpha map or a trimap may be input as the binary mask image.
  • a plurality of three-dimensional transformation parameters may be prestored in a database, and appropriate parameters may be selected from the database and input as the three-dimensional transformation parameters acquired by the three-dimensional affine transformation parameter acquiring unit 13 .
  • the three-dimensional affine transformation parameter acquiring unit 13 creates, in advance, reference binary mask images and the three-dimensional affine transformation parameters by which the frame picture shape becomes optimum for the reference binary mask images, and stores the reference binary mask images and the three-dimensional affine transformation parameters in the database in correspondence to each other.
  • the three-dimensional affine transformation parameter acquiring unit 13 selects, from the database, a reference binary mask image having a high similarity to the entered binary mask image, and acquires and outputs the three-dimensional affine transformation parameters stored in correspondence to the selected reference binary mask image.
  • the appropriate three-dimensional affine transformation parameters can be acquired from the database and can be used to deform or combine a frame picture object.
  • a key-point feature called SIFT (Scale-Invariant Feature Transform) and an area feature called MSER (Maximally Stable Extremal Regions) are used to represent the features of an image, and the similarity between images is obtained by calculating the distances between these features in a feature space. That is, binary mask image features and reference binary mask image features, which are calculated in advance and stored in the database, may be obtained and compared to find the image with the largest similarity, and the three-dimensional affine transformation parameters stored in correspondence to that image may be used.
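  • A minimal sketch of this database lookup, with feature extraction (SIFT, MSER, or any other descriptor) assumed to be done elsewhere and features represented simply as fixed-length vectors; the similarity is taken here as the Euclidean distance in the feature space:

```python
import numpy as np

def select_affine_parameters(query_feature, reference_features, reference_parameters):
    """Return the three-dimensional affine transformation parameters stored with the
    reference binary mask image whose feature vector is closest to the query's."""
    refs = np.asarray(reference_features)          # N x D stored feature vectors
    q = np.asarray(query_feature)                  # D-dimensional query feature
    distances = np.linalg.norm(refs - q, axis=1)   # distances in the feature space
    best = int(np.argmin(distances))               # most similar reference image
    return reference_parameters[best]
```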
  • the similarity calculation may be carried out not only between binary mask images but also between input images. That is, both the features of the input image and the features of the binary mask image may be used together in the similarity calculation as a new feature.
  • the frame picture may be a three-dimensional object rather than a two-dimensional texture.
  • the three-dimensional object is mapped to an XY plane, and a bounding rectangle of the mapped three-dimensional object is calculated as the input rectangle.
  • the bounding rectangle is used as an ordinary two-dimensional rectangle to determine its position and scale in advance.
  • a position and scale are applied to the three-dimensional object, which is then combined with the object in the input image.
  • the object image can be combined with a curved frame or thickened frame to create a three-dimensional image for which depth perception is enhanced.
  • FIG. 17 shows an example of the structure of a general-purpose personal computer, in which a central processing unit (CPU) 1001 is included.
  • An input/output interface 1005 is connected to the CPU 1001 via a bus 1004 .
  • a read-only memory (ROM) 1002 and a random-access memory (RAM) 1003 are connected to the bus 1004 .
  • Units connected to the input/output interface 1005 are an input unit 1006 , including a keyboard, a mouse, and other input devices, through which the user enters operation commands, an output unit 1007 that outputs processing operation screens and images obtained as a result of processing to a display device, a storage unit 1008 including a hard disk drive that stores programs and various types of data, and a communication unit 1009 , including a local area network (LAN) adapter, which executes communication processing through a network typified by the Internet.
  • Also connected to the input/output interface 1005 is a drive 1010 that writes and reads data to and from removable media 1011 such as a magnetic disc (including a flexible disc), an optical disc (including a compact disc read-only memory (CD-ROM) and a digital versatile disc (DVD)), a magneto-optical disc (including a mini-disc (MD)), or a semiconductor memory.
  • the CPU 1001 executes various processes according to the programs that have been stored in the ROM 1002 or that are read from the removable media 1011 such as a magnetic disk, optical disk, magneto-optical disk, or semiconductor memory, installed in the storage unit 1008 , and loaded from the storage unit 1008 into the RAM 1003 . Data used by the CPU 1001 to execute the various processes is also stored in the RAM 1003 at appropriate times.
  • the processes described in this specification as being executed in time series in the order described may include processes that are executed not in time series but in parallel or individually.

Abstract

An image processing apparatus, which creates a pseudo three-dimensional image that improves depth perception of the image, includes: an input image acquiring unit that acquires an input image and a binary mask image that specifies an object area on the input image; a combining unit that extracts pixels in an area inside a quadrangular frame picture of the input image and pixels in the object area, specified by the binary mask image, on the input image to create a combined image; and a frame picture combining position determining unit that determines a position on the combined image at which the quadrangular frame picture is placed so that one of a pair of opposite edges of the quadrangular frame picture includes an intersection with a boundary of the object area and the other of the pair does not include an intersection with the boundary of the object area.

Description

    BACKGROUND OF THE INVENTION
  • 1. Field of the Invention
  • The present invention relates to an image processing apparatus, an image processing method, and a program and, more particularly, to an image processing apparatus that can easily create a pseudo three-dimensional image by combining an object image, which is obtained from an input image and a binary mask image that specifies an object area on the input image, with a planar image that simulates a picture frame or architrave, to an image processing method, and to a program.
  • 2. Description of the Related Art
  • In a method proposed to easily generate a three-dimensional image, a pseudo image is created by adding a depth image to a two-dimensional image rather than by supplying a three-dimensional image.
  • Japanese Unexamined Patent Application Publication No. 2008-084338, for example, proposes a method of creating a pseudo three-dimensional image by adding relief-like depth data to texture data, which is divided into objects.
  • A technique by which a pseudo three-dimensional image is created by combining an object cut from an image and a planar object together is also proposed (visit http://www.flickr.com/groups/oob/pool/).
  • An algorithm of software that aids pseudo three-dimensional image creation is also proposed, according to which a user deforms or moves an object to be combined by using a mouse or another pointer to edit a shadow of a photo object or computer graphics (CG) object (see 3D-aware Image Editing for Out of Bounds Photography, Amit Shesh et al., Graphics Interface, 2009).
  • SUMMARY OF THE INVENTION
  • In the method proposed in Japanese Unexamined Patent Application Publication No. 2008-084338, however, the user gives the center of each divided object and sets a depth, making operations complex.
  • In the technique disclosed at http://www.flickr.com/groups/oob/pool/, an image processing tool in a personal computer is used to process images, so the user, who has to actually operate the image processing tool, may not easily create pseudo three-dimensional images.
  • When creating a three-dimensional image as described in 3D-aware Image Editing for Out of Bounds Photography, Amit Shesh et al., Graphics Interface, 2009, the user uses a mouse to specify the position and shape of a frame; since this operation is complex, the user needs skill to create an accurate image.
  • It is desirable to easily create a pseudo three-dimensional image by combining an object image, which is obtained from an input image and a binary mask image that specifies an object area on the input image, with a planar image that simulates a picture frame or architrave.
  • An image processing apparatus according to an embodiment of the present invention creates a pseudo three-dimensional image that improves depth perception of the image; the image processing apparatus includes an input image acquiring means for acquiring an input image and a binary mask image that specifies an object area on the input image, a combining means for extracting pixels in an area inside a quadrangular frame picture of the input image and pixels in the object area, specified by the binary mask image, on the input image to create a combined image, and a frame picture combining position determining means for determining a position on the combined image at which the quadrangular frame picture is placed so that one of a pair of opposite edges of the quadrangular frame picture includes an intersection with a boundary of the object area and the other of the pair does not include an intersection with the boundary of the object area.
  • The quadrangular frame picture can be formed so that the edge that does not include the intersection with the boundary of the object area is longer than the edge that includes the intersection.
  • The position of the quadrangular frame picture can be determined by rotating the picture around a predetermined position.
  • The quadrangular frame picture can be formed by carrying out three-dimensional affine transformation on a predetermined quadrangular frame picture.
  • The combining means can create the combined image by continuously deforming the shape of the quadrangular frame picture and extracting the pixels in the area inside the quadrangular frame picture of the input image and the pixels in the object area, specified by the binary mask image, on the input image.
  • The combining means can create a plurality of combined images by extracting the pixels in the area inside the quadrangular frame picture, which has a plurality of types of shapes or is formed at a predetermined position, and the pixels in the object area, specified by the binary mask image, on the input image.
  • The combining means can create the combined image by storing input images or binary mask images, each of which is used to create the combined image, in correspondence to frame shape parameters, which include the rotational angle of the quadrangular frame picture, three-dimensional affine transformation parameters, and positions, by forming a frame picture with a predetermined quadrangular shape, according to the frame shape parameters stored in correspondence to a stored input image or binary mask image that is found, by comparison, to be most similar to the input image or binary mask image obtained by the input image acquiring means in the stored input images and binary mask images, and by extracting the pixels in the area inside the quadrangular frame picture of the input image and the pixels in the object area, specified by the binary mask image, on the input image.
  • An image processing method according to an embodiment of the present invention is a method for use in an image processing apparatus operable to create a pseudo three-dimensional image that improves depth perception of the image; the image processing method includes an input image acquiring step of acquiring an input image and a binary mask image that specifies an object area on the input image, a combining step of extracting pixels in an area inside a quadrangular frame picture of the input image and pixels in the object area, specified by the binary mask image, on the input image to create a combined image, and a frame picture combining position determining step of determining a position on the combined image at which the quadrangular frame picture is placed so that one of a pair of opposite edges of the quadrangular frame picture includes an intersection with a boundary of the object area and the other of the pair does not include an intersection with the boundary of the object area.
  • A program according to an embodiment of the present invention is executable by a computer that controls an image processing apparatus operable to create a pseudo three-dimensional image that improves depth perception of the image so as to execute a process including an input image acquiring step of acquiring an input image and a binary mask image that specifies an object area on the input image, a combining step of extracting pixels in an area inside a quadrangular frame picture of the input image and pixels in the object area, specified by the binary mask image, on the input image to create a combined image, and a frame picture combining position determining step of determining a position on the combined image at which the quadrangular frame picture is placed so that one of a pair of opposite edges of the quadrangular frame picture includes an intersection with a boundary of the object area and the other of the pair does not include an intersection with the boundary of the object area.
  • According to an embodiment of the present invention, an input image and a binary mask image that specifies an object area on the input image are acquired, pixels in an area inside a quadrangular frame picture of the input image and pixels in the object area, specified by the binary mask image, on the input image are extracted to create a combined image, and a position on the combined image at which the quadrangular frame picture is placed is determined so that one of a pair of opposite edges of the quadrangular frame picture includes an intersection with a boundary of the object area and the other of the pair does not include an intersection with the boundary of the object area.
  • According to the embodiments of the present invention, a pseudo three-dimensional image can be easily created by combining an object image, which is obtained from an input image and a binary mask image that specifies an object area on the input image, with a planar image that simulates a picture frame or architrave.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a block diagram showing an example of the structure of a pseudo three-dimensional image creating apparatus in an embodiment of the present invention;
  • FIG. 2 is a block diagram showing an example of the structure of the frame picture combining parameter calculator in FIG. 1;
  • FIG. 3 is a flowchart illustrating a pseudo three-dimensional image creation process;
  • FIG. 4 shows an input image and its binary mask image;
  • FIG. 5 illustrates a frame picture texture image;
  • FIG. 6 illustrates three-dimensional affine transformation parameters;
  • FIG. 7 illustrates three-dimensional affine transformation;
  • FIG. 8 is a flowchart illustrating a frame picture combining parameter calculation process;
  • FIG. 9 illustrates the frame picture combining parameter calculation process;
  • FIG. 10 also illustrates the frame picture combining parameter calculation process;
  • FIG. 11 shows an object layer image and a frame layer image;
  • FIG. 12 shows an exemplary combined image;
  • FIG. 13 illustrates a relation between a frame picture and an object image;
  • FIG. 14 shows another exemplary combined image;
  • FIG. 15 shows other exemplary combined images;
  • FIG. 16 shows other exemplary combined images; and
  • FIG. 17 is a block diagram showing the structure of an example of a general-purpose personal computer.
  • DESCRIPTION OF THE PREFERRED EMBODIMENTS
  • Example of the Structure of a Pseudo Three-Dimensional Image Creating Apparatus
  • FIG. 1 is a block diagram showing an example of the structure of a pseudo three-dimensional image creating apparatus in an embodiment of the present invention. The pseudo three-dimensional image creating apparatus 1 in FIG. 1 combines an input image, a binary mask image from which an object area on the input image has been cut out, and a frame picture texture image to create an image that spuriously appears to be a stereoscopic three-dimensional image.
  • More specifically, to create a pseudo stereoscopic image, the pseudo three-dimensional image creating apparatus 1 combines an image obtained by cutting out an object area from an input image according to its corresponding binary mask image with an image obtained by performing projection deformation of a frame picture texture image.
  • The pseudo three-dimensional image creating apparatus 1 has an input image acquiring unit 11, a frame picture texture acquiring unit 12, a three-dimensional affine transformation parameter acquiring unit 13, a rectangular three-dimensional affine transformer 14, a frame picture combining parameter calculator 15, a frame picture combining unit 16, and an output unit 17.
  • The input image acquiring unit 11 acquires an input image and a binary mask image that specifies an object area on the input image, and supplies the acquired images to the frame picture combining parameter calculator 15. The input image is an RGB color image in red, green, and blue, for example. The binary mask image has the same resolution as the input image and holds one of two values such as 1 and 0 to indicate whether the relevant pixel is included in the object area, for example. The input image and binary mask image are arbitrarily selected or supplied by the user. Of course, the input image and binary mask image are made to correspond to each other.
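  • As a minimal sketch (not part of the patent text) of what the input image acquiring unit 11 handles, the following Python snippet loads an RGB input image and a same-resolution binary mask; the file names are hypothetical and NumPy/Pillow are assumed to be available.

```python
import numpy as np
from PIL import Image

# Hypothetical file names; any RGB image and its corresponding mask will do.
input_image = np.asarray(Image.open("input.png").convert("RGB"))
mask = np.asarray(Image.open("mask.png").convert("L"))

# The binary mask has the same resolution as the input image and holds
# 1 for pixels inside the object area and 0 elsewhere.
assert mask.shape == input_image.shape[:2]
binary_mask = (mask > 127).astype(np.uint8)
```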
  • The frame picture texture acquiring unit 12 acquires a texture image to be attached to a quadrangular frame picture in, for example, a square shape, and supplies the texture image to the frame picture combining unit 16. The texture image visually appears as a plane; an example of it is an image that simulates the white frame of a printed photo.
  • The three-dimensional affine transformation parameter acquiring unit 13 acquires three-dimensional affine transformation parameters, which are used in three-dimensional affine transformation performed on the frame picture texture image, and supplies these parameters to the rectangular three-dimensional affine transformer 14. The three-dimensional affine transformation parameters may be directly specified with numerals or may be arbitrarily set according to user input operations through graphical user interfaces (GUIs) such as mouse drags and scroll bars.
  • The rectangular three-dimensional affine transformer 14 calculates rectangular parameters from the three-dimensional affine transformation parameters acquired from the three-dimensional affine transformation parameter acquiring unit 13 and supplies the calculated rectangular parameters to the frame picture combining parameter calculator 15. The rectangular parameters indicate the two-dimensional coordinates of the four vertexes of the frame picture texture image after the three-dimensional affine transformation and the central position of the rectangle. The aspect ratio of the original rectangle used for the transformation may be specified by the user by operating an operation unit (not shown). Alternatively, the aspect ratio of the frame picture texture image entered by operating the operation unit may be used instead.
  • The frame picture combining parameter calculator 15 calculates the positions and scales of the input image and binary mask image, supplied from the input image acquiring unit 11, and the frame picture to be combined, and supplies frame picture parameters to the frame picture combining unit 16 together with the input image and binary mask image. The frame picture parameters supplied to the frame picture combining unit 16 indicate the four two-dimensional vertex coordinates of the quadrangular frame picture in the image coordinate system. The structure of the frame picture combining parameter calculator 15 will be described later in detail with reference to FIG. 2.
  • The frame picture combining unit 16 combines the input image, the binary mask image, and a frame shape structure image together according to the frame picture combining parameters to create a pseudo three-dimensional image on which its object visually appears to be stereoscopic, and then outputs the created image to the output unit 17. Specifically, the frame picture combining unit 16 includes an object layer image creating unit 16 a and a frame layer image creating unit 16 b. The object layer image creating unit 16 a creates an image in the object area, that is, an object layer image, from the input image, binary mask image, and frame shape structure image, according to the frame picture combining parameters. The frame layer image creating unit 16 b creates an image in the frame picture texture area, that is, a frame layer image, from the input image, binary mask image, and frame shape structure image, according to the frame picture combining parameters. The frame picture combining unit 16 combines the object layer image and frame layer image, which have been thus created, together to create a combined image, which is a pseudo three-dimensional image.
  • The output unit 17 receives a combined image created as a pseudo three-dimensional image by the frame picture combining unit 16, and outputs the received image.
  • Frame Picture Combining Parameter Calculator
  • Next, the structure of the frame picture combining parameter calculator 15 will be described in detail with reference to FIG. 2.
  • The frame picture combining parameter calculator 15 has a mask barycenter calculator 51, a frame picture scale calculator 52, and a frame picture vertex calculator 53. The frame picture combining parameter calculator 15 determines constraint conditions, which are used to obtain a frame picture shape, from the binary mask image to determine the position and scale of the frame picture.
  • To obtain the barycenter position of the object shape from the binary mask image, the mask barycenter calculator 51 obtains the average of the positions of the pixels in the object area, that is, of all pixels with a value of 1 in the binary mask image, as the barycenter position. Then, the mask barycenter calculator 51 sends the average to the frame picture scale calculator 52.
  • The frame picture scale calculator 52 has a central position calculator 52 a, a scale calculator 52 b, and a scale deciding unit 52 c. The frame picture scale calculator 52 calculates a frame picture central position P_FRAME and a scale S_FRAME from the barycenter position and a frame setting angle θg, which is an input parameter, and sends the calculated values to the frame picture vertex calculator 53. The frame picture central position P_FRAME and scale S_FRAME will be described later in detail.
  • The frame picture vertex calculator 53 receives the frame picture central position P_FRAME and scale S_FRAME from the frame picture scale calculator 52, and outputs the four vertexes, which are frame picture combining parameters.
  • Pseudo Three-Dimensional Image Creation Process
  • A pseudo three-dimensional image creation process will be described next with reference to the flowchart in FIG. 3.
  • In step S11, the input image acquiring unit 11 acquires an input image and a binary mask image corresponding to the input image and then sends them to the frame picture combining parameter calculator 15. An exemplary input image and its corresponding binary mask image are respectively shown on the left and right in FIG. 4. In FIG. 4, the butterfly on the input image is an object image, so, on the binary mask image, pixels in the area in which the butterfly is displayed are displayed in white and pixels in the remaining area are displayed in black.
  • In step S12, the frame picture texture acquiring unit 12 acquires a frame picture texture image, which is selected when an operation unit (not shown) including a mouse and keyboard is operated, and sends the acquired image to the frame picture combining unit 16. An exemplary frame picture texture image is shown in FIG. 5; the image is formed by pixels, the value of each of which is α. The outermost edge forming a frame is set to black, the pixel value α being 0; the inner edge next to the frame is set to white, the pixel value α being 1; the central part is set to black, the pixel value α being 0. That is, the frame picture texture image in FIG. 5 is formed from black and white edges.
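  • A minimal sketch of such an α-valued frame picture texture, built with NumPy; the image size, border width, and ring width are arbitrary assumptions.

```python
import numpy as np

def make_frame_texture(size=256, border=8, ring=24):
    """Alpha map of a photo-frame-like texture: alpha = 0 (black) on the
    outermost edge, alpha = 1 (white) on the frame ring, alpha = 0 in the centre."""
    alpha = np.zeros((size, size), dtype=np.float32)          # outermost edge
    alpha[border:size - border, border:size - border] = 1.0   # white ring
    inner = border + ring
    alpha[inner:size - inner, inner:size - inner] = 0.0       # central part
    return alpha
```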
  • In step S13, the three-dimensional affine transformation parameter acquiring unit 13 acquires three-dimensional affine transformation parameters, which are used to carry out three-dimensional affine transformation on the frame picture texture image, when the operation unit (not shown) is operated, and sends the acquired parameters to the rectangular three-dimensional affine transformer 14.
  • The three-dimensional affine transformation parameters are used to carry out affine transformation on a quadrangular frame picture so that the picture visually appears like a stereoscopic shape. Specifically, as shown in FIG. 6, these parameters are a rotation θx around the x axis, which is in the horizontal direction, a rotation θz around the z axis, which is the line of sight, a distance f from an imaging position P to the frame used as the frame picture texture, which is a subject, a distance tx traveled in the x direction, which is horizontal to the image, and a distance ty traveled in the y direction, which is the vertical direction of the image.
  • In step S14, the rectangular three-dimensional affine transformer 14 receives the three-dimensional affine transformation parameters sent from the three-dimensional affine transformation parameter acquiring unit 13, calculates rectangular parameters, and sends the calculated parameters to the frame picture combining parameter calculator 15.
  • Specifically, the rectangular three-dimensional affine transformer 14 obtains transformed coordinates by using a coordinate system, in which the central point of a rectangular frame picture is fixed to the origin (0, 0), the coordinate system being normalized to match the width in the x or y direction, whichever is longer. That is, when the rectangular frame picture is square, the rectangular three-dimensional affine transformer 14 sets the rectangular center RC and the four vertex coordinates p0 (−1, −1), p1 (1, −1), p2 (1, 1), p3 (−1, 1), which are taken before transformation. The rectangular three-dimensional affine transformer 14 then assigns the vertex coordinates p0 to p3, rectangular center RC, and three-dimensional affine transformation parameters to equation (1) to calculate vertex coordinates p0′ to p3′ and rectangular center RC′ transformed by three-dimensional affine transformation.

  • p′ = Tf Ts Rθx Rθz p   (1)
  • where Rθz is a rotational transformation matrix, represented by equation (2), that corresponds to a rotation θz about the z axis, and Rθx is a rotational transformation matrix, represented by equation (3), that corresponds to a rotation θx about the x axis; Ts is a transformation matrix, represented by equation (4), that corresponds to the distances tx and ty, and Tf is a transformation matrix, represented by equation (5), that corresponds to the distance f.
  • R_{\theta_z} = \begin{bmatrix} \cos\theta_z & -\sin\theta_z & 0 & 0 \\ \sin\theta_z & \cos\theta_z & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad (2) \qquad R_{\theta_x} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta_x & \sin\theta_x & 0 \\ 0 & -\sin\theta_x & \cos\theta_x & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad (3)
  • T_s = \begin{bmatrix} 1 & 0 & 0 & t_x \\ 0 & 1 & 0 & t_y \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad (4) \qquad T_f = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & f \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad (5)
  • As a result of the transformation, a frame picture texture image such as an upper image in FIG. 7, represented by the vertex coordinates p0 to p3 of a rectangle and its center RC, is transformed into a frame picture texture image such as a lower image in FIG. 7, represented by the vertexes p0′ to p3′ of another rectangle and its center RC′. In this process, only the four vertex coordinates are obtained, and the frame picture texture image itself is not handled.
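  • A sketch of this vertex transformation, following equations (1) to (5); the final perspective divide by the transformed z coordinate is an assumption made so that the rectangle appears perspective-deformed as in FIG. 7, since only the matrix product itself is given above.

```python
import numpy as np

def transform_rectangle(theta_x, theta_z, f, tx, ty):
    """Apply p' = Tf Ts Rx Rz p to the normalized square vertices and centre."""
    cz, sz = np.cos(theta_z), np.sin(theta_z)
    cx, sx = np.cos(theta_x), np.sin(theta_x)
    Rz = np.array([[cz, -sz, 0, 0], [sz, cz, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]])
    Rx = np.array([[1, 0, 0, 0], [0, cx, sx, 0], [0, -sx, cx, 0], [0, 0, 0, 1]])
    Ts = np.array([[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, 0], [0, 0, 0, 1]])
    Tf = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, f], [0, 0, 0, 1]])
    M = Tf @ Ts @ Rx @ Rz

    # Vertices p0..p3 of the normalized square and the rectangular centre RC.
    points = np.array([[-1, -1, 0, 1], [1, -1, 0, 1], [1, 1, 0, 1],
                       [-1, 1, 0, 1], [0, 0, 0, 1]], dtype=float)
    transformed = points @ M.T
    # Assumed perspective projection onto the image plane at distance f.
    return transformed[:, :2] * f / transformed[:, 2:3]
```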
  • In step S15, the frame picture combining parameter calculator 15 executes a frame picture combining parameter calculation process to calculate frame picture combining parameters and sends the calculated parameters to the frame picture combining unit 16.
  • Frame Picture Combining Parameter Calculation Process
  • The frame picture combining parameter calculation process will then be described with reference to the flowchart in FIG. 8.
  • In step S31, the mask barycenter calculator 51 calculates the mask barycenter position BC of the shape of the object from the binary mask image, and sends the calculated barycenter position to the frame picture scale calculator 52. Specifically, as shown in FIG. 9, the mask barycenter calculator 51 extracts pixels with a pixel value α of 1 (pixels in white in the drawing) from all pixels in the binary mask image, which forms an object of a butterfly, and determines the average coordinates of these pixel positions as the mask barycenter position BC.
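  • A minimal sketch of step S31 in NumPy (binary_mask holds 1 inside the object area, as above): the mask barycenter BC is the average position of the object pixels.

```python
import numpy as np

def mask_barycenter(binary_mask):
    """Average (x, y) position of all pixels whose value is 1."""
    ys, xs = np.nonzero(binary_mask)
    return np.array([xs.mean(), ys.mean()])
```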
  • In step S32, the frame picture scale calculator 52 controls the central position calculator 52 a to calculate the frame picture central position P_FRAME from the mask barycenter position BC received from the mask barycenter calculator 51 and from the frame setting angle θg, which is an input parameter.
  • Specifically, the central position calculator 52 a first calculates a contour point CP to determine the position of the frame picture. That is, the central position calculator 52 a obtains a vector RV, which has been rotated clockwise by the frame setting angle θg from the lower direction of the image, as shown in FIG. 9, the lower direction being handled as a reference vector. The central position calculator 52 a further obtains, as the contour position CP, a two-dimensional position at which the pixel value α first changes from 1 to 0 during a motion from the mask barycenter position BC in the direction of the vector RV, that is, at which the contour of the object area (boundary of the object area) is first encountered, as shown in FIG. 9. The contour position CP is the central position P_FRAME of the frame picture texture.
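  • A sketch of this contour-point search; the step length and the exact sign convention of the clockwise rotation in image coordinates (y pointing down) are assumptions, chosen so that θg = 0 walks downward and θg = 90 degrees walks to the left, matching the frame placement described below.

```python
import numpy as np

def contour_point(binary_mask, bc, theta_g_deg, step=1.0):
    """Walk from BC along the rotated vector RV until the mask value
    first changes from 1 to 0; return the last position inside the object."""
    t = np.deg2rad(theta_g_deg)
    rv = np.array([-np.sin(t), np.cos(t)])   # downward image direction rotated by theta_g
    pos = np.asarray(bc, dtype=float).copy()
    h, w = binary_mask.shape
    while True:
        nxt = pos + step * rv
        xi, yi = int(round(nxt[0])), int(round(nxt[1]))
        if not (0 <= xi < w and 0 <= yi < h) or binary_mask[yi, xi] == 0:
            return pos
        pos = nxt
```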
  • In step S33, the scale calculator 52 b sets the frame picture texture image to calculate the scale S_FRAME, which is the scale of the frame picture. Specifically, the scale calculator 52 b rotates the frame picture texture image formed by the vertex coordinates p0′ to p3′ of the rectangle and its center RC′, which are obtained after three-dimensional affine transformation, by the frame setting angle θg, to update the vertex coordinates to p0″ to p3″. That is, the frame picture texture image is rotated clockwise, centered around the rectangular center RC′ and the vertex coordinates p0′ to p3′ are updated to the vertex coordinates p0″ to p3″.
  • Accordingly, if the frame setting angle θg is 0 degree, for example, the frame picture texture is disposed at the bottom of the object; if θg is 90 degrees, the frame picture texture is disposed so that it stands on the left side of the object.
  • In step S34, the scale calculator 52 b determines a longer edge LE and a shorter edge SE from the vertex coordinates p0″ to p3″ to obtain a straight line of each edge. For example, the longer edge LE is the longest edge of the frame picture texture and the shorter edge SE is the edge opposite to the longer edge LE, as shown in FIG. 10. When the frame picture texture is traced clockwise, the edge placed next to the longer edge LE is the left edge L0 and the edge placed next to the shorter edge SE is the right edge L1.
  • The scale calculator 52 b calculates, as a longer-edge scale S_LE, a scale when the longer edge LE passes through the farthest point in the direction of the vector RV of the binary mask image. Specifically, in the case shown in FIG. 10, the scale calculator 52 b calculates, as the longer-edge scale S_LE, the scale when the longer edge LE passes through the intersection F1 (on the straight line T4), which is the farthest point intersecting with the object image in the direction of the vector RV from the straight line T3, which passes through the mask barycenter position BC and is orthogonal to the vector RV. That is, when the frame picture is enlarged or reduced about the central position P_FRAME (contour point CP), the longer-edge scale S_LE is obtained as the enlargement ratio or reduction ratio when the longer edge LE is disposed on the straight line T4.
  • In step S35, the scale calculator 52 b calculates, as a shorter-edge scale S_SE, a scale when the shorter edge SE passes through the farthest point in the direction opposite to the direction of the vector RV of the binary mask image. Specifically, in the case shown in FIG. 10, the scale calculator 52 b calculates, as the shorter-edge scale S_SE, the scale when the shorter edge SE passes through the intersection F3 (on the straight line T5), which is the farthest point intersecting with the object image in the direction opposite to the direction of the vector RV from the straight line T3, which passes through the mask barycenter position BC and is orthogonal to the vector RV. That is, when the frame picture is enlarged or reduced about the central position P_FRAME (contour point CP), the shorter-edge scale S_SE is obtained as the enlargement ratio or reduction ratio when the shorter edge SE is disposed on the straight line T5.
  • In step S36, as shown in FIG. 10, the scale calculator 52 b calculates, as a left-edge scale S_L0, a scale when the left edge L0 is in the direction of the vector RV relative to the straight line T3, which passes through the mask barycenter position BC and is perpendicular to the vector RV, and includes the intersection F1 (on the straight line T1) with the object image in the area R0 on the left edge L0 side relative to the straight line R0R that passes through the mask barycenter position BC and is parallel to the left edge L0 and when the left edge L0 passes through the intersection F1 with the object image, which is at the farthest point from the straight line R0R that passes through the mask barycenter position BC and is parallel to the left edge L0. That is, when the frame picture is enlarged or reduced about the central position P_FRAME (contour point CP), the left-edge scale S_L0 is obtained as the enlargement ratio or reduction ratio applied when the left-edge L0 is positioned on the straight line T1.
  • In step S37, the scale calculator 52 b calculates, as a right-edge scale S_L1, a scale when the right edge L1 is in the direction of the vector RV relative to the straight line T3, which passes through the mask barycenter position BC and is perpendicular to the vector RV, and includes the intersection F2 (on the straight line T2) with the object image in the area R1 on the right edge L1 side relative to the straight line R1L that passes through the mask barycenter position BC and is parallel to the right edge L1 and when the right edge L1 passes through the intersection F2 with the object image, which is at the farthest point from the straight line R1L that passes through the mask barycenter position BC and is parallel to the right edge L1. That is, when the frame picture is enlarged or reduced about the central position P_FRAME (contour point CP), the right-edge scale S_L1 is obtained as the enlargement ratio or reduction ratio applied when the right edge L1 is positioned on the straight line T2.
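  • Geometrically, each of the scales in steps S34 to S37 can be read as the factor by which the frame must be enlarged or reduced about P_FRAME so that the line containing the relevant edge passes through the corresponding farthest object point. A sketch of that computation (the function and argument names are hypothetical):

```python
import numpy as np

def edge_scale(p_frame, edge_p0, edge_p1, farthest_point):
    """Scale about p_frame that moves the line through edge_p0-edge_p1 onto
    farthest_point: the ratio of the signed distances of the target point and
    of the edge line from p_frame, measured along the edge normal."""
    p_frame = np.asarray(p_frame, dtype=float)
    d = np.asarray(edge_p1, dtype=float) - np.asarray(edge_p0, dtype=float)
    n = np.array([-d[1], d[0]])
    n /= np.linalg.norm(n)                                   # unit normal of the edge line
    dist_edge = np.dot(n, np.asarray(edge_p0, dtype=float) - p_frame)
    dist_target = np.dot(n, np.asarray(farthest_point, dtype=float) - p_frame)
    return dist_target / dist_edge
```
  • S_LE, S_SE, S_L0, and S_L1 would then be obtained by calling this helper with the longer edge, the shorter edge, the left edge, and the right edge together with their respective farthest intersections described above.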
  • In step S38, the scale deciding unit 52 c calculates the scale S_FRAME of the frame picture texture by using the longer-edge scale S_LE, shorter-edge scale S_SE, left-edge scale S_L0, and right-edge scale S_L1, according to equation (6) below.

  • S_FRAME = MIN(β × MAX(S_LE, S_L0, S_L1), S_SE)   (6)
  • where β, which takes a value of 1 or more, is an arbitrary coefficient to adjust the size of the frame picture, MAX(A, B, C) is a function to select the maximum value of values A to C, MIN(D, E) is a function to select the minimum value of values D and E. Accordingly, the scale deciding unit 52 c obtains the maximum value of the longer-edge scale S_LE, left-edge scale S_L0, and right-edge scale S_L1 and also obtains the minimum value of the obtained maximum value and shorter-edge scale S_SE, as the scale S_FRAME of the frame picture texture. The frame picture scale calculator 52 then sends the calculated scale S_FRAME and central position P_FRAME to the frame picture vertex calculator 53.
  • Comparison with the shorter-edge scale S_SE is carried out only with MIN(D, E) in equation (6). This is because, for the shorter-edge scale S_SE, the distance from the central position P_FRAME (contour point CP) to the farthest point of the object is longer than for the other farthest points, as shown in FIG. 10; that is, the shorter-edge scale S_SE is much larger than the other scales.
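  • Equation (6) itself is a one-liner; β is the size-adjusting coefficient of 1 or more described above, and its concrete value here is only an illustrative assumption.

```python
def decide_frame_scale(s_le, s_se, s_l0, s_l1, beta=1.2):
    # S_FRAME = MIN(beta * MAX(S_LE, S_L0, S_L1), S_SE)   -- equation (6)
    return min(beta * max(s_le, s_l0, s_l1), s_se)
```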
  • In step S39, the frame picture vertex calculator 53 uses the central position P_FRAME and scale S_FRAME of the frame picture texture, which have been received from the frame picture scale calculator 52, to perform parallel movement so that the central position RC″ of the frame picture texture matches the central position P_FRAME, which is the contour point CP.
  • In step S40, the frame picture vertex calculator 53 enlarges each edge about the central position of the frame picture texture by an amount equal to the scale S_FRAME.
  • In step S41, the frame picture vertex calculator 53 obtains the two-dimensional positions FP0 to FP3 of the four vertexes of the enlarged frame picture texture, and then sends the obtained two-dimensional positions FP0 to FP3 of the four vertexes to the frame picture combining unit 16 at a later stage as the frame picture combining parameters.
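  • Steps S39 to S41 amount to a translation followed by a uniform scaling about the frame centre; a sketch under the assumption that the rotated vertices p0″ to p3″ and the centre RC″ are given as two-dimensional coordinates:

```python
import numpy as np

def frame_vertices(rotated_vertices, rc, p_frame, s_frame):
    """Move the texture so that its centre coincides with P_FRAME (step S39),
    then enlarge every vertex about that centre by S_FRAME (steps S40 and S41)."""
    verts = np.asarray(rotated_vertices, dtype=float)   # p0''..p3''
    p_frame = np.asarray(p_frame, dtype=float)
    shifted = verts + (p_frame - np.asarray(rc, dtype=float))
    return p_frame + s_frame * (shifted - p_frame)      # FP0..FP3
```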
  • According to the processes described above, the frame picture combining parameters can be set so that the two-dimensional coordinates of the four vertexes of the frame picture texture become optimum for the object area on the basis of the longer edge, shorter edge, left edge, and right edge of the frame picture texture and the farthest distance in the object area.
  • Now, the process in the flowchart in FIG. 3 will be described again.
  • In step S15, the frame picture combining parameter calculation process is executed to calculate frame picture combining parameters, after which the sequence proceeds to step S16.
  • In step S16, the frame picture combining unit 16 controls the object layer image creating unit 16 a to create an object layer image from an input image and binary mask image. Specifically, for example, the object layer image creating unit 16 a creates, in the object area, an object layer image as shown in the upper left part of FIG. 11 from a binary mask image as shown in the lower left part of FIG. 11, the mask image being made up of pixels with the pixel value α being set to 1 and pixels with the pixel value α being set to 0 (indicating black).
  • In step S17, the frame picture combining unit 16 controls the frame layer image creating unit 16 b to create a frame layer image rendered by mapping the frame picture texture image to the frame picture texture, which has undergone projection deformation by the frame picture combination parameters. Specifically, for example, the frame layer image creating unit 16 b creates a binary mask image of a quadrangular frame picture, as shown in the lower-right part of FIG. 11, according to two-dimensional vertex coordinates given as the frame picture parameters. In an area in which the frame picture is drawn on the binary mask image of the frame picture, α is 1, where the pixel values of the input image are output; in the other area, α is 0, where all pixel values are 0. Then, the frame layer image creating unit 16 b creates the frame layer image, as shown in the upper right part of FIG. 11, from the input image and the created binary mask image of the frame picture.
  • In step S18, the frame picture combining unit 16 combines the object layer image and frame layer image together to create a combined pseudo three-dimensional image as shown in FIG. 12, and sends the combined image to the output unit 17.
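  • A sketch of steps S16 to S18, assuming frame_alpha is a rasterized binary mask of the projected quadrangular frame picture (for example, filled from the vertices FP0 to FP3) and binary_mask is the object mask: the frame layer keeps the input pixels inside the frame picture, and the object layer is drawn on top so that the object stays in front.

```python
import numpy as np

def combine_layers(input_image, binary_mask, frame_alpha):
    """Frame layer + object layer -> combined pseudo three-dimensional image."""
    frame_layer = input_image * frame_alpha[..., None]   # frame layer (step S17)
    combined = frame_layer.copy()
    obj = binary_mask.astype(bool)
    combined[obj] = input_image[obj]                     # object layer on top (S16, S18)
    return combined.astype(input_image.dtype)
```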
  • In step S19, the output unit 17 outputs the created pseudo three-dimensional combined image.
  • The processes described above can thus create a pseudo three-dimensional image that exploits, as cues for human depth perception, the overlap with the frame picture texture image and the perspective of a rectangle on which projection transformation has been performed.
  • That is, in human vision, depth perception can generally be attained from cues such as perspective projection and vanishing points obtained from a rectangle on which projection transformation has been performed. A fore-and-aft relation can also be perceived from the order in which an object image and a frame image overlap. To have a person recognize the fore-and-aft relation represented by perspective and overlap in this way, it suffices to satisfy the conditions shown in FIG. 13.
  • Specifically, a first condition is that the edge on the far side of a frame picture, that is, the shorter edge overlaps an object and is behind the object. More specifically, the first condition is that, for example, as shown in FIG. 13, the shorter edge of a frame picture V2 has intersections with the boundary of an object area V1 and only the object is displayed in the object area V1.
  • A second condition is that the edge on the near side of the frame picture, that is, the longer edge has no intersection with the boundary of the object area. Specifically, the second condition is that, for example, as shown in FIG. 13, the longer edge of the frame picture V2 has no intersection with the boundary of the object area V1.
  • A third condition is that the frame picture has a shape that can be three-dimensionally present. Specifically, the third condition is that the frame picture V2 has a shape that can be three-dimensionally present.
  • The first and second conditions are satisfied by disposing the longer edge B of the frame picture V2, a straight line C passing through a bottom point of the object area, and the shorter edge A of the frame picture V2 in that order from the near side, as shown in FIG. 13. That is, it suffices that the shorter edge of the frame picture V2 has intersections with the boundary of the object area, the object is displayed between the intersections, and the longer edge of the frame picture V2 has no intersection with the boundary of the object area.
  • In the frame picture combining parameter calculation process in FIG. 8, any one of the scales, which have been enlarged or reduced about the central position P_FRAME so that the longer edge, shorter edge, right edge, or left edge passes its farthest point of the object area, is set as the scale S_FRAME. Accordingly, the scale of the frame picture is determined so that the longer edge has no intersection with the object area and the shorter edge has intersections with the object area.
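  • Whether a candidate frame actually satisfies the first two conditions can be checked directly on the binary mask; the following sketch (the sampling density is an arbitrary assumption) reports whether an edge segment passes over both object and background pixels, that is, whether it crosses the boundary of the object area. The shorter edge should cross and the longer edge should not.

```python
import numpy as np

def edge_crosses_object(binary_mask, p0, p1, samples=256):
    """True if the segment p0-p1 covers both object (1) and background (0) pixels."""
    ts = np.linspace(0.0, 1.0, samples)[:, None]
    pts = (1.0 - ts) * np.asarray(p0, dtype=float) + ts * np.asarray(p1, dtype=float)
    h, w = binary_mask.shape
    xs = np.clip(pts[:, 0].round().astype(int), 0, w - 1)
    ys = np.clip(pts[:, 1].round().astype(int), 0, h - 1)
    values = binary_mask[ys, xs]
    return values.min() != values.max()
```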
  • As a result, since the object image is combined with the frame picture enlarged or reduced as described above, a pseudo three-dimensional image that visually appears to be stereoscopic can be created.
  • According to the embodiments of the present invention, a pseudo three-dimensional image can be easily created by combining an object image, which is obtained from an input image and a binary mask image that specifies an object area on the input image, with a planar image that simulates a picture frame or architrave.
  • When the frame picture is deformed only by three-dimensional affine transformation, the frame picture can remain in a three-dimensional shape. When a texture is mapped to the frame picture itself by, for example, projection transformation, information usable as a clue of a perspective can be given, improving depth perception.
  • As shown in FIG. 14, for example, when two opposite edges of a quadrangular frame picture intersect the object area of an airplane-shaped toy, a pseudo three-dimensional image that a user can enjoy can also be created. In this case, to determine the shape of the frame picture, the barycenter of the object area is obtained, for example, after which, centered around the barycenter, the widths can be calculated as twice the maximum value and minimum value in the X direction of the object area, and the heights can be calculated as half the maximum value and minimum value in the Y direction. A depth emphasizing effect can be obtained just by placing the frame picture behind the object.
  • The frame picture combining parameter calculator 15 can also place the frame picture upside down or oppositely, rather than on the ground, by adjusting the frame setting angle θg. Specifically, as shown in FIG. 15, the frame picture can be placed behind the airplane-shaped toy, which is the object, or inverted parallel to the toy.
  • The frame picture combining parameter calculator 15 may also calculate the N-order moment of the binary mask image and the center of a bounding box or the center of a circumscribed circle as the parameters to calculate the frame picture shape. That is, mask image distribution may be considered for the central position instead of using a simple barycenter position.
  • The frame picture combining parameter calculator 15 may obtain the parameters to calculate the frame picture shape not only from the binary mask image but also from the input image itself. Specifically, the vanishing points of the image or the ground may be detected to determine the shape and position of the frame picture so that an edge of the frame picture is placed along a vanishing line of the input image or in a ground area. For a method of automatically detecting a vanishing line from an image, see "A new Approach for Vanishing Point Detection in Architectural Environments, Carsten Rother, BMVC2000".
  • In this method, edges of an architectural structure are detected and the direction of parallel edges is statistically processed to calculate vanishing points. Two vanishing points obtained by this method can be used to calculate the frame picture combining parameters. Specifically, the constraint that opposite edges of the frame picture converge at two different vanishing points is added in determination of the position and shape of the frame picture.
  • A projection transformation parameter f of the frame picture may also be determined by obtaining an approximate object size from object classification based on machine learning.
  • Specifically, a pseudo three-dimensional image that is more naturally stereoscopic may be created by using camera parameters for macro photography when the object is small like a cup or by using camera parameters for telescopic photography when the object is large like a building. For the method of classifying objects, see "Object Detection by Joint Feature Based on Relations of Local Features, Fujiyoshi Hironobu". In this method, machine learning is carried out in advance on features based on relations of local features of an object, and the object is then found from an image.
  • The frame picture combining parameter calculator 15 may also render a frame picture to which a texture image is not mapped during frame layer image creation. In this case, a rectangle may be drawn just by specifying a color for the frame picture, or the pixel colors of the input image may be drawn.
  • A user interface may be provided so that the user can correct the shape of the frame picture while viewing the pseudo three-dimensional image calculated by the frame picture combining unit 16. Specifically, the user may operate the user interface to move the four vertexes of the frame picture or move the entire frame picture. Alternatively, an interface to change the vanishing point to deform the frame picture may be provided.
  • A user input may be supplied to the three-dimensional affine transformation parameter acquiring unit 13 to directly update the frame shape parameters.
  • The frame picture combining unit 16 may deform the binary mask image itself. Specifically, when a frame picture object is combined at the bottom of an object area, specified by the binary mask image, that continuously extends to the bottom of the image, the binary mask image may be cut so that the binary mask image does not extend beyond the frame picture toward the near side, creating a pseudo three-dimensional image that is naturally stereoscopic.
  • Specifically, when a binary mask image as shown in the upper-right part of FIG. 16 is input for an input image as shown in the upper-left part of FIG. 16, part of the fountain base on which a doll, which is an object, is mounted is cut to match the frame picture as shown in the lower-left part of FIG. 16. When the input image is processed by using the resulting binary mask image shown in the lower-left part of FIG. 16, a pseudo three-dimensional image, as shown in the lower-right part of FIG. 16, in which the fountain base is cut to match the frame picture shape can be created.
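  • One way to realize this cut is to zero out the mask on the near side of the line through the frame's longer edge; the sketch below assumes the edge vertices are ordered so that the positive side of the computed normal is the near side, which in practice would have to be checked against the actual vertex order.

```python
import numpy as np

def clip_mask_to_frame(binary_mask, edge_p0, edge_p1):
    """Zero mask pixels lying on the (assumed) near side of the longer edge line."""
    h, w = binary_mask.shape
    ys, xs = np.mgrid[0:h, 0:w]
    d = np.asarray(edge_p1, dtype=float) - np.asarray(edge_p0, dtype=float)
    n = np.array([-d[1], d[0]])                              # normal of the longer edge
    signed = (xs - edge_p0[0]) * n[0] + (ys - edge_p0[1]) * n[1]
    clipped = binary_mask.copy()
    clipped[signed > 0] = 0                                  # assumed near half-plane
    return clipped
```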
  • The input image is not limited to a still image; it may be a moving image. When the input image is a moving image, the frame picture parameters may be determined from a representative moving image frame and a mask image to determine the shape of the frame picture. To determine the shape of the frame picture, the frame picture parameters may also be determined for each moving image frame.
  • The frame picture may not be a still image; an image created by changing the three-dimensional affine transformation parameters or frame setting angle parameters may be animated.
  • Not only a processing result is presented by a combination of one type of parameter, but also a plurality of processing results may be output by a combination of a plurality of parameters. That is, the pseudo three-dimensional image creating apparatus may present pseudo three-dimensional images created by a combination of a plurality of parameters within a predetermined parameter range, and the user may select a preferable image from the presented images.
  • The frame picture combining unit 16 may use processed input images, such as blurred input images, gray-scaled images, or images with low brightness, instead of filling the areas other than the frame picture and object, that is, the background with a background color.
  • An alpha map or a trimap may be input as the binary mask image.
  • A plurality of three-dimensional transformation parameters may be prestored in a database, and appropriate parameters may be selected from the database and input as the three-dimensional transformation parameters acquired by the three-dimensional affine transformation parameter acquiring unit 13.
  • Specifically, the three-dimensional affine transformation parameter acquiring unit 13 creates, in advance, reference binary mask images and their three-dimensional affine transformation parameters by which the frame picture shape becomes optimum for the reference binary mask images, and stores the reference binary mask images and the three-dimensional affine transformation parameters in correspondence to each other. The three-dimensional affine transformation parameter acquiring unit 13 then selects, from the database, a reference binary mask image having a high similarity to the entered binary mask image, and acquires and outputs the three-dimensional affine transformation parameters stored in correspondence to the selected reference binary mask image.
  • Accordingly, the appropriate three-dimensional affine transformation parameters can be acquired from the database and can be used to deform or combine a frame picture object.
  • For a method of calculating a similarity to an image, see “Zhong Wu, Qifa Ke, Michael Isard, and Jian Sun. Bundling Features for Large Scale Partial-Duplicate Web Image Search. CVPR 2009 (oral)”. In this method, a feature called SIFT at a key point and an area feature called MSER are used to represent the feature of an image, and the similarity of the image is obtained by calculating the distances of these features in a feature space. That is, binary mask image features and reference binary mask image features, which are calculated in advance and stored in the database, may be obtained and compared to find an image with the largest similarity, and the three-dimensional affine transformation parameter stored in correspondence to the image may be used.
  • The similarity calculation may be carried out not only between binary mask images but also between images. That is, both the feature of the input image and the features of the binary mask image may be used together in the similarity calculation as a new feature.
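  • A minimal sketch of this database lookup, using a downsampled mask as a crude stand-in for the SIFT/MSER features mentioned above; the feature choice and the database structure (a list of feature/parameter pairs) are assumptions.

```python
import numpy as np

def mask_feature(binary_mask, size=32):
    """Crude feature vector: the mask resampled to a fixed-size grid and flattened."""
    h, w = binary_mask.shape
    ys = np.arange(size) * h // size
    xs = np.arange(size) * w // size
    return binary_mask[np.ix_(ys, xs)].astype(np.float32).ravel()

def select_affine_parameters(binary_mask, database):
    """database: list of (reference_feature, affine_parameters) pairs."""
    query = mask_feature(binary_mask)
    distances = [np.linalg.norm(query - ref) for ref, _ in database]
    return database[int(np.argmin(distances))][1]
```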
  • The frame picture may be a three-dimensional object rather than a two-dimensional texture. In this case, the three-dimensional object is mapped to an XY plane, and a bounding rectangle of the mapped three-dimensional object is calculated as the input rectangle. The bounding rectangle is used as an ordinary two-dimensional rectangle to determine its position and scale in advance. After the three-dimensional object undergoes three-dimensional affine transformation as in the bounding rectangle, a position and scale are applied to the three-dimensional object, which is then combined with the object in the input image. In this way, the object image can be combined with a curved frame or thickened frame to create a three-dimensional image for which depth perception is enhanced.
  • Although a series of processes described above can be executed by hardware, it can also be executed by software. When the series of processes is executed by software, programs constituting the software are installed from a storage medium into, for example, a computer embedded in dedicated hardware or a general-purpose personal computer that can execute various functions after various programs are installed therein.
  • FIG. 17 shows an example of the structure of a general-purpose personal computer, in which a central processing unit (CPU) 1001 is included. An input/output interface 1005 is connected to the CPU 1001 via a bus 1004. A read-only memory (ROM) 1002 and a random-access memory (RAM) 1003 are connected to the bus 1004.
  • Units connected to the input/output interface 1005 are an input unit 1006, including a keyboard, a mouse, and other input devices, through which the user enters operation commands, an output unit 1007 that outputs processing operation screens and images obtained as a result of processing to a display device, a storage unit 1008 including a hard disk drive that stores programs and various types of data, and a communication unit 1009, including a local area network (LAN) adapter, which executes communication processing through a network typified by the Internet. Another unit connected to the input/output interface 1005 is a drive 1010 that writes and reads data to and from removable media 1011 such as a magnetic disc (including a flexible disc), an optical disc (including a compact disc read-only memory (CD-ROM) and a digital versatile disc (DVD)), a magneto-optical disc (including a mini-disc (MD)), or a semiconductor memory.
  • The CPU 1001 executes various processes according to the programs that have been stored in the ROM 1002 or that are read from the removable media 1011 such as a magnetic disk, optical disk, magneto-optical disk, or semiconductor memory, installed in the storage unit 1008, and loaded from the storage unit 1008 into the RAM 1003. Data used by the CPU 1001 to execute the various processes is also stored in the RAM 1003 at appropriate times.
  • The steps describing the processes in this description may include not only processes executed in time series in the order described but also processes executed in parallel or individually rather than in time series.
  • The present application contains subject matter related to that disclosed in Japanese Priority Patent Application JP 2009-195900 filed in the Japan Patent Office on Aug. 26, 2009, the entire content of which is hereby incorporated by reference.
  • It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and alterations may occur depending on design requirements and other factors insofar as they are within the scope of the appended claims or the equivalents thereof.

Claims (10)

What is claimed is:
1. An image processing apparatus creating a pseudo three-dimensional image that improves depth perception of the image, the apparatus comprising:
input image acquiring means for acquiring an input image and a binary mask image that specifies an object area on the input image;
combining means for extracting pixels in an area inside a quadrangular frame picture of the input image and pixels in the object area, specified by the binary mask image, on the input image to create a combined image; and
frame picture combining position determining means for determining a position on the combined image at which the quadrangular frame picture is placed so that one of a pair of opposite edges of the quadrangular frame picture includes an intersection with a boundary of the object area and another of the pair does not include an intersection with the boundary of the object area.
2. The image processing apparatus according to claim 1, wherein the quadrangular frame picture is formed so that the edge that does not include the intersection with the boundary of the object area is longer than the edge that includes the intersection.
3. The image processing apparatus according to claim 1, wherein a position of the quadrangular frame picture may be determined by rotating the quadrangular frame picture around a predetermined position.
4. The image processing apparatus according to claim 1, wherein the quadrangular frame picture is formed by carrying out three-dimensional affine transformation on a predetermined quadrangular frame picture.
5. The image processing apparatus according to claim 1, wherein the combining means creates the combined image by continuously deforming a shape of the quadrangular frame picture and extracting the pixels in the area inside the quadrangular frame picture of the input image and the pixels in the object area on the binary mask image of the input image.
6. The image processing apparatus according to claim 1, wherein the combining means creates a plurality of combined images by extracting the pixels in the area inside the quadrangular frame picture, which has a plurality of types of shapes or is formed at a predetermined position, and the pixels in the object area, specified by the binary mask image, on the input image.
7. The image processing apparatus according to claim 1, wherein the combining means creates the combined image:
by storing input images or binary mask images, each of which is used to create the combined image, in correspondence to frame shape parameters, which include a rotational angle of the quadrangular frame picture, three-dimensional affine transformation parameters, and positions;
by forming a frame picture with a predetermined quadrangular shape, according to the frame shape parameters stored in correspondence to a stored input image or binary mask image that is found, by comparison, to be most similar to the input image or binary mask image obtained by the input image acquiring means in the stored input images and binary mask images; and
by extracting the pixels in the area inside the quadrangular frame picture of the input image and the pixels in the object area, specified by the binary mask image, on the input image.
8. An image processing method for use in an image processing apparatus operable to create a pseudo three-dimensional image that improves depth perception of the image, the method comprising the steps of:
acquiring an input image and a binary mask image that specifies an object area on the input image;
extracting pixels in an area inside a quadrangular frame picture of the input image and pixels in the object area, specified by the binary mask image, on the input image to create a combined image; and
determining a position on the combined image at which the quadrangular frame picture is placed so that one of a pair of opposite edges of the quadrangular frame picture includes an intersection with a boundary of the object area and another of the pair does not include an intersection with the boundary of the object area.
9. A program executable by a computer that controls an image processing apparatus operable to create a pseudo three-dimensional image that improves depth perception of the image so as to execute a process including the steps of:
acquiring an input image and a binary mask image that specifies an object area on the input image;
extracting pixels in an area inside a quadrangular frame picture of the input image and pixels in the object area, specified by the binary mask image, on the input image to create a combined image; and
determining a position on the combined image at which the quadrangular frame picture is placed so that one of a pair of opposite edges of the quadrangular frame picture includes an intersection with a boundary of the object area and another of the pair does not include an intersection with the boundary of the object area.
10. An image processing apparatus creating a pseudo three-dimensional image that improves depth perception of the image, the apparatus comprising:
an input image acquiring unit acquiring an input image and a binary mask image that specifies an object area on the input image;
a combining unit extracting pixels in an area inside a quadrangular frame picture of the input image and pixels in the object area, specified by the binary mask image, on the input image to create a combined image; and
a frame picture combining position determining unit determining a position on the combined image at which the quadrangular frame picture is placed so that one of a pair of opposite edges of the quadrangular frame picture includes an intersection with a boundary of the object area and another of the pair does not include an intersection with the boundary of the object area.
US12/859,110 2009-08-26 2010-08-18 Image processing apparatus, image processing method, and program Abandoned US20110050685A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JPP2009-195900 2009-08-26
JP2009195900A JP5299173B2 (en) 2009-08-26 2009-08-26 Image processing apparatus, image processing method, and program

Publications (1)

Publication Number Publication Date
US20110050685A1 true US20110050685A1 (en) 2011-03-03

Family

ID=43624175

Family Applications (1)

Application Number Title Priority Date Filing Date
US12/859,110 Abandoned US20110050685A1 (en) 2009-08-26 2010-08-18 Image processing apparatus, image processing method, and program

Country Status (3)

Country Link
US (1) US20110050685A1 (en)
JP (1) JP5299173B2 (en)
CN (1) CN102005059B (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100201681A1 (en) * 2009-02-09 2010-08-12 Microsoft Corporation Image Editing Consistent with Scene Geometry
US20140184591A1 (en) * 2011-08-23 2014-07-03 Tomtom International B.V. Methods of and apparatus for displaying map information
US9188433B2 (en) 2012-05-24 2015-11-17 Qualcomm Incorporated Code in affine-invariant spatial mask
CN110826357A (en) * 2018-08-07 2020-02-21 北京市商汤科技开发有限公司 Method, device, medium and equipment for three-dimensional detection and intelligent driving control of object
CN112651896A (en) * 2020-12-30 2021-04-13 成都星时代宇航科技有限公司 Valid vector range determining method and device, electronic equipment and readable storage medium
WO2022093112A1 (en) * 2020-10-30 2022-05-05 北京字跳网络技术有限公司 Image synthesis method and device, and storage medium
US11481941B2 (en) * 2020-08-03 2022-10-25 Google Llc Display responsive communication system and method

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103096046A (en) * 2011-10-28 2013-05-08 深圳市快播科技有限公司 Video frame processing method, device and player
US8971611B2 (en) 2012-02-08 2015-03-03 JVC Kenwood Corporation Image process device, image process method, and image process program
JP6930091B2 (en) * 2016-11-15 2021-09-01 富士フイルムビジネスイノベーション株式会社 Image processing equipment, image processing methods, image processing systems and programs
WO2018155670A1 (en) * 2017-02-27 2018-08-30 パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ Image distribution method, image display method, image distribution device and image display device
CA3073618A1 (en) * 2017-09-01 2019-03-07 Magic Leap, Inc. Generating a new frame using rendered content and non-rendered content from a previous perspective
CN110942420B (en) * 2018-09-21 2023-09-15 阿里巴巴(中国)有限公司 Method and device for eliminating image captions
CN109949208B (en) * 2019-02-21 2023-02-07 深圳市广德教育科技股份有限公司 Internet-based automatic 3D clothing pattern generation system
JP7231530B2 (en) * 2019-11-20 2023-03-01 アンリツ株式会社 X-ray inspection device
CN117368210B (en) * 2023-12-08 2024-02-27 荣旗工业科技(苏州)股份有限公司 Defect detection method based on multi-dimensional composite imaging technology

Citations (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6028955A (en) * 1996-02-16 2000-02-22 Microsoft Corporation Determining a vantage point of an image
US6166744A (en) * 1997-11-26 2000-12-26 Pathfinder Systems, Inc. System for combining virtual images with real-world scenes
US20020048401A1 (en) * 2000-09-01 2002-04-25 Yuri Boykov Graph cuts for binary segmentation of n-dimensional images from object and background seeds
US6414678B1 (en) * 1997-11-20 2002-07-02 Nintendo Co., Ltd. Image creating apparatus and image display apparatus
US20020093513A1 (en) * 1999-02-03 2002-07-18 Yakov Kamen Mechanism and apparatus for realistic 3D model creation using interactive scissors
US20020113791A1 (en) * 2001-01-02 2002-08-22 Jiang Li Image-based virtual reality player with integrated 3D graphics objects
US20030081836A1 (en) * 2001-10-31 2003-05-01 Infowrap, Inc. Automatic object extraction
US20030137508A1 (en) * 2001-12-20 2003-07-24 Mirko Appel Method for three dimensional image reconstruction
US6686926B1 (en) * 1998-05-27 2004-02-03 In-Three, Inc. Image processing system and method for converting two-dimensional images into three-dimensional images
US20050157926A1 (en) * 2004-01-15 2005-07-21 Xerox Corporation Method and apparatus for automatically determining image foreground color
US20050196070A1 (en) * 2003-02-28 2005-09-08 Fujitsu Limited Image combine apparatus and image combining method
US20050219240A1 (en) * 2004-04-05 2005-10-06 Vesely Michael A Horizontal perspective hands-on simulator
US20050271273A1 (en) * 2004-06-03 2005-12-08 Microsoft Corporation Foreground extraction using iterated graph cuts
US20060132482A1 (en) * 2004-11-12 2006-06-22 Oh Byong M Method for inter-scene transitions
US20060193509A1 (en) * 2005-02-25 2006-08-31 Microsoft Corporation Stereo-based image processing
US20060214932A1 (en) * 2005-03-21 2006-09-28 Leo Grady Fast graph cuts: a weak shape assumption provides a fast exact method for graph cuts segmentation
US20060285747A1 (en) * 2005-06-17 2006-12-21 Microsoft Corporation Image segmentation
US20070014473A1 (en) * 2005-07-15 2007-01-18 Siemens Corporate Research Inc System and method for graph cuts image segmentation using a shape prior
US20070031037A1 (en) * 2005-08-02 2007-02-08 Microsoft Corporation Stereo image segmentation
US20070269108A1 (en) * 2006-05-03 2007-11-22 Fotonation Vision Limited Foreground / Background Separation in Digital Images
US20080137989A1 (en) * 2006-11-22 2008-06-12 Ng Andrew Y Arrangement and method for three-dimensional depth image construction
US20080198175A1 (en) * 2007-02-20 2008-08-21 Microsoft Corporation Drag-And-Drop Pasting For Seamless Image Composition
US20090080774A1 (en) * 2007-09-24 2009-03-26 Microsoft Corporation Hybrid Graph Model For Unsupervised Object Segmentation
US20090116732A1 (en) * 2006-06-23 2009-05-07 Samuel Zhou Methods and systems for converting 2d motion pictures for stereoscopic 3d exhibition
US7567246B2 (en) * 2003-01-30 2009-07-28 The University Of Tokyo Image processing apparatus, image processing method, and image processing program
US20100201681A1 (en) * 2009-02-09 2010-08-12 Microsoft Corporation Image Editing Consistent with Scene Geometry
US7907793B1 (en) * 2001-05-04 2011-03-15 Legend Films Inc. Image sequence depth enhancement system and method
US20110115787A1 (en) * 2008-04-11 2011-05-19 Terraspark Geosciences, Llc Visulation of geologic features using data representations thereof

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3030485B2 (en) * 1994-03-17 2000-04-10 富士通株式会社 Three-dimensional shape extraction method and apparatus
JPH0991451A (en) * 1995-09-26 1997-04-04 Matsushita Electric Ind Co Ltd Image edit device
DE69915901T2 (en) * 1998-01-14 2004-09-02 Canon K.K. Image processing device
JP3603118B2 (en) * 2001-06-08 2004-12-22 東京大学長 Pseudo three-dimensional space expression system, pseudo three-dimensional space construction system, game system, and electronic map providing system
JP4080386B2 (en) * 2003-07-01 2008-04-23 日本電信電話株式会社 Depth information regeneration method, depth information regeneration device, program, and recording medium
CN1296873C (en) * 2004-07-15 2007-01-24 浙江大学 Travel-in-picture method based on relative depth computing
US7525555B2 (en) * 2004-10-26 2009-04-28 Adobe Systems Incorporated Facilitating image-editing operations across multiple perspective planes
JP4541397B2 (en) * 2007-11-05 2010-09-08 日本電信電話株式会社 Pseudo three-dimensional image generation apparatus, pseudo three-dimensional image generation method, and pseudo three-dimensional image generation program

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Amit Shesh, Antonio Criminisi, Carsten Rother, Gavin Smyth, "3D-aware Image Editing for Out of Bounds Photography", Graphics Interface Conference, May 2009. *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100201681A1 (en) * 2009-02-09 2010-08-12 Microsoft Corporation Image Editing Consistent with Scene Geometry
US8436852B2 (en) 2009-02-09 2013-05-07 Microsoft Corporation Image editing consistent with scene geometry
US20140184591A1 (en) * 2011-08-23 2014-07-03 Tomtom International B.V. Methods of and apparatus for displaying map information
US9710962B2 (en) * 2011-08-23 2017-07-18 Tomtom Navigation B.V. Methods of and apparatus for displaying map information
US9188433B2 (en) 2012-05-24 2015-11-17 Qualcomm Incorporated Code in affine-invariant spatial mask
US9207070B2 (en) 2012-05-24 2015-12-08 Qualcomm Incorporated Transmission of affine-invariant spatial mask for active depth sensing
US9448064B2 (en) 2012-05-24 2016-09-20 Qualcomm Incorporated Reception of affine-invariant spatial mask for active depth sensing
CN110826357A (en) * 2018-08-07 2020-02-21 Beijing Sensetime Technology Development Co Ltd Method, device, medium and equipment for three-dimensional detection and intelligent driving control of object
US11481941B2 (en) * 2020-08-03 2022-10-25 Google Llc Display responsive communication system and method
WO2022093112A1 (en) * 2020-10-30 2022-05-05 Beijing Zitiao Network Technology Co Ltd Image synthesis method and device, and storage medium
GB2605307A (en) * 2020-10-30 2022-09-28 Beijing Zitiao Network Technology Co Ltd Image synthesis method and device, and storage medium
CN112651896A (en) * 2020-12-30 2021-04-13 Chengdu Star Era Aerospace Technology Co Ltd Valid vector range determining method and device, electronic equipment and readable storage medium

Also Published As

Publication number Publication date
CN102005059A (en) 2011-04-06
JP2011048586A (en) 2011-03-10
JP5299173B2 (en) 2013-09-25
CN102005059B (en) 2013-03-20

Similar Documents

Publication Publication Date Title
US20110050685A1 (en) Image processing apparatus, image processing method, and program
US10861232B2 (en) Generating a customized three-dimensional mesh from a scanned object
CN108648269B (en) Method and system for singulating three-dimensional building models
US10777002B2 (en) 3D model generating system, 3D model generating method, and program
US6529206B1 (en) Image processing apparatus and method, and medium therefor
JP4981135B2 (en) How to create a diagonal mosaic image
CN107484428B (en) Method for displaying objects
US20190362539A1 (en) Environment Synthesis for Lighting An Object
US10607405B2 (en) 3D model generating system, 3D model generating method, and program
US6556195B1 (en) Image processing device and image processing method
JP7370527B2 (en) Method and computer program for generating three-dimensional model data of clothing
US7265761B2 (en) Multilevel texture processing method for mapping multiple images onto 3D models
JP2019510297A (en) Virtual try-on to the user's true human body model
JP3626144B2 (en) Method and program for generating 2D image of cartoon expression from 3D object data
EP4036790A1 (en) Image display method and device
US20200211255A1 (en) Methods, devices, and computer program products for checking environment acceptability for 3d scanning
US11080920B2 (en) Method of displaying an object
Arpa et al. Perceptual 3D rendering based on principles of analytical cubism
JP2002163640A (en) Outline drawing device
JP3149389B2 (en) Method and apparatus for overlaying a bitmap image on an environment map
CN116152389B (en) Visual angle selection and texture alignment method for texture mapping and related equipment
WO2018151612A1 (en) Texture mapping system and method
JP2000057378A (en) Image processor, image processing method, medium, and device and method for extracting contour
US11961200B2 (en) Method and computer program product for producing 3 dimensional model data of a garment
JP7455546B2 (en) Image processing device, image processing method, and program

Legal Events

Date Code Title Description
AS Assignment

Owner name: SONY CORPORATION, JAPAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:YAMADA, HIDESHI;REEL/FRAME:024857/0490

Effective date: 20100709

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION