The Scope of Image Processing
․Image processing can be divided into several areas: (1) image input digitization and image display (output); (2) image enhancement and restoration of degraded images (mathematically the most involved); (3) image coding and compression; (4) shape processing, an important way of handling image features — skeletonization, for example, is a technique commonly used in pattern recognition; (5) image analysis/segmentation/classification, feature extraction, and feature representation and description; (6) color image processing; (7) computer vision, which is inseparable from image representation, interpretation, and processing.
․Besides image processing, the disciplines that deal with pictorial information include computer graphics and pattern recognition. Pattern recognition is the back-end stage of image processing and belongs to the process referred to as intelligent cognition. The feature vectors of rotated, enlarged, or shrunken versions of the same object are roughly similar, which is one basis for pattern classification. Through image analysis an object is separated from the background and useful features are extracted, from which the image is recognized, described, segmented, and classified. This involves probability: p(i|x) denotes the probability that a pattern x belongs to class i, p(x|k) is the probability density function of pattern x within class k, and p(k) is the prior probability of class k; p(x) is independent of class and is called the relative frequency (still a probability) of x. Let w_k be the weight vector at the k-th iteration; the weights are changed whenever a classification error occurs. The training rule is:
    w_{k+1} = w_k + c·x_k   if w_k^T x_k ≤ 0 and x_k ∈ class i
    w_{k+1} = w_k − c·x_k   if w_k^T x_k ≥ 0 and x_k ∈ class j
    w_{k+1} = w_k           otherwise
․The information produced by image processing is still a picture, while the information produced by pattern recognition is abstract, condensed data. Computer graphics studies how to generate the picture corresponding to a textual description (for example, a circle's center coordinates, radius, and color); computer graphics and pattern recognition can basically be regarded as two opposite processes. Image analysis tries to find, from an input image, the shape of a pattern (circle, square, ...), its center position, its color, and so on. Its relation to computer graphics: image → (image analysis) → description, and description → (computer graphics) → image.
Topic on Rasterization
․Since all modern displays are raster-oriented, the difference between raster-only and vector graphics comes down to where they are rasterized: on the client side in the case of vector graphics, as opposed to already rasterized on the (web) server.
․Basic approach: The most basic algorithm takes a 3D scene, described as polygons, and renders it onto a 2D surface, usually a computer monitor. Polygons are themselves represented as collections of triangles, and triangles are represented by 3 vertices in 3D space. At a very basic level, rasterizers simply take a stream of vertices, transform them into corresponding 2D points on the viewer's monitor, and fill in the transformed 2D triangles as appropriate.
․Transformations are usually performed by matrix multiplication; quaternion math may also be used. The main transformations are translation, scaling, rotation, and projection. A 3D vertex may be transformed by augmenting an extra variable (known as a "homogeneous variable") and left-multiplying the resulting 4-component vertex by a 4 × 4 transformation matrix.
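A minimal sketch of this transformation step (an illustration, not from the original notes; it assumes the column-vector convention, with matrices applied from the left, and uses the standard homogeneous-coordinate translation and z-rotation forms; `numpy` is assumed for the matrix algebra):

```python
import numpy as np

def translation(tx, ty, tz):
    """4x4 homogeneous translation matrix."""
    M = np.eye(4)
    M[:3, 3] = [tx, ty, tz]
    return M

def rotation_z(theta):
    """4x4 homogeneous rotation about the z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0, 0],
                     [s,  c, 0, 0],
                     [0,  0, 1, 0],
                     [0,  0, 0, 1]])

# Augment a 3D vertex with the homogeneous variable w = 1.
v = np.array([1.0, 0.0, 0.0, 1.0])

# Compose: rotate 90 degrees about z, then translate by +5 in x.
M = translation(5, 0, 0) @ rotation_z(np.pi / 2)
print(M @ v)   # approximately [5, 1, 0, 1]
```

Note that the rightmost matrix in the product is applied to the vertex first, which is why composition order matters on the transformation stack described below.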
․A translation is simply the movement of a point from its original location to another location in 3-space by a constant offset. Translations can be represented by the following matrix, where X, Y, and Z are the offsets in the 3 dimensions, respectively:
    [ 1  0  0  X ]
    [ 0  1  0  Y ]
    [ 0  0  1  Z ]
    [ 0  0  0  1 ]
․A scaling transformation is performed by multiplying the position of a vertex by a scalar value. This has the effect of scaling a vertex with respect to the origin. Scaling can be represented by the following matrix, where X, Y, and Z are the values by which each of the 3 dimensions is multiplied; asymmetric scaling can be accomplished by varying the values of X, Y, and Z:
    [ X  0  0  0 ]
    [ 0  Y  0  0 ]
    [ 0  0  Z  0 ]
    [ 0  0  0  1 ]
․Rotation matrices depend on the axis around which a point is to be rotated; θ in each of these cases represents the angle of rotation.
  (1) Rotation about the X-axis:
    [ 1    0       0      0 ]
    [ 0  cos θ  -sin θ    0 ]
    [ 0  sin θ   cos θ    0 ]
    [ 0    0       0      1 ]
  (2) Rotation about the Y-axis:
    [  cos θ  0  sin θ  0 ]
    [    0    1    0    0 ]
    [ -sin θ  0  cos θ  0 ]
    [    0    0    0    1 ]
  (3) Rotation about the Z-axis:
    [ cos θ  -sin θ  0  0 ]
    [ sin θ   cos θ  0  0 ]
    [   0       0    1  0 ]
    [   0       0    0  1 ]
․Rasterization systems generally use a transformation stack to move the stream of input vertices into place. The transformation stack is a standard stack which stores matrices; incoming vertices are multiplied by the matrix stack. As an illustrative example, imagine a simple scene with a single model of a person. The person is standing upright, facing an arbitrary direction, while his head is turned in another direction. The person is also located at a certain offset from the origin. A stream of vertices, the model, would be loaded to represent the person. First, a translation matrix would be pushed onto the stack to move the model to the correct location. A scaling matrix would be pushed onto the stack to size the model correctly. A rotation about the y-axis would be pushed onto the stack to orient the model properly. Then, the stream of vertices representing the body would be sent through the rasterizer.
Since the head is facing a different direction, the rotation matrix would be popped off the top of the stack and a different rotation matrix about the y-axis with a different angle would be pushed. Finally the stream of vertices representing the head would be sent to the rasterizer. After all points have been transformed to their desired locations in 3-space with respect to the viewer, they must be transformed to the 2D image plane. The orthographic projection simply involves removing the z component from the transformed 3D vertices. Orthographic projections have the property that all parallel lines in 3-space remain parallel in the 2D representation. However, real-world images are perspective images, with distant objects appearing smaller than objects close to the viewer; a perspective projection transformation needs to be applied to these points.
․Conceptually, the idea is to transform the perspective viewing volume into the orthogonal viewing volume. The perspective viewing volume is a frustum, that is, a truncated pyramid. The orthographic viewing volume is a rectangular box, where both the near and far viewing planes are parallel to the image plane.
․A perspective projection transformation can be represented by the following matrix, where F and N are the distances of the far and near viewing planes, respectively:
    [ 1  0      0       0 ]
    [ 0  1      0       0 ]
    [ 0  0  (F+N)/N   -F  ]
    [ 0  0    1/N       0 ]
The resulting four-component vector will be a vector whose homogeneous variable is not 1. Homogenizing the vector, i.e. multiplying it by the inverse of the homogeneous variable so that the homogeneous variable becomes unitary, gives us the resulting 2D location in the x and y coordinates.
․Clipping: Once triangle vertices are transformed to their proper 2D locations, some of these locations may be outside the viewing window, or the area on the screen to which pixels will actually be written. Clipping is the process of truncating triangles to fit them inside the viewing area.
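A numeric sketch of the perspective divide (illustrative only; it assumes one common form of the perspective matrix, consistent with the F/N description above, in which the homogeneous variable becomes z/N and depths z = N and z = F are preserved after homogenizing):

```python
import numpy as np

def perspective(N, F):
    """Perspective matrix: after homogenizing, z = N stays N and z = F stays F."""
    return np.array([[1, 0, 0,           0],
                     [0, 1, 0,           0],
                     [0, 0, (F + N) / N, -F],
                     [0, 0, 1 / N,       0]], dtype=float)

def homogenize(v):
    """Divide by the homogeneous variable so it becomes unitary."""
    return v / v[3]

P = perspective(N=1.0, F=10.0)
v = np.array([2.0, 1.0, 4.0, 1.0])   # point at depth z = 4
p = homogenize(P @ v)
print(p[:2])   # screen x, y shrink with depth: [0.5, 0.25]
```

Doubling the depth of the point halves its projected x and y, which is exactly the foreshortening effect the orthographic projection lacks.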
․The common technique is the Sutherland–Hodgman clipping algorithm: one of the 4 edges of the image plane is tested at a time. For each edge, test all points to be rendered; if a point is outside the edge, the point is removed. For each triangle edge that is intersected by the image plane's edge — that is, where one vertex of the edge is inside the image and the other is outside — a point is inserted at the intersection and the outside point is removed.
․Scan conversion: The final step in the traditional rasterization process is to fill in the 2D triangles that are now in the image plane, also known as scan conversion. The first problem to consider is whether or not to draw a pixel at all. For a pixel to be rendered, it must be within a triangle, and it must not be occluded, or blocked by another pixel. The most popular algorithm for filling in pixels inside a triangle is the scanline algorithm. Since it is difficult to guarantee that the rasterization engine will draw all pixels from front to back, there must be some way of ensuring that pixels close to the viewer are not overwritten by pixels far away.
․The z-buffer, the most common solution, is a 2D array corresponding to the image plane which stores a depth value for each pixel. Whenever a pixel is drawn, it updates the z-buffer with its depth value. Any new pixel must check its depth value against the z-buffer value before it is drawn; closer pixels are drawn and farther pixels are disregarded.
․To find out a pixel's color, textures and shading calculations must be applied. A texture map is a bitmap that is applied to a triangle to define its look. Each triangle vertex is also associated with a texture and, for normal 2D textures, a texture coordinate (u,v) in addition to its position coordinate.
Every time a pixel on a triangle is rendered, the corresponding texel (or texture element) in the texture must be found. This is done by interpolating between the texture coordinates associated with the triangle's vertices, weighted by the pixel's on-screen distance from the vertices. In perspective projections, interpolation is performed on the texture coordinates divided by the depth of the vertex to avoid a problem known as perspective foreshortening (a process known as perspective texturing).
․Before the final color of the pixel can be decided, a lighting calculation must be performed to shade the pixels based on any lights which may be present in the scene. There are generally three light types commonly used in scenes.
․Directional lights are lights which come from a single direction and have the same intensity throughout the entire scene. In real life, sunlight comes close to being a directional light, as the sun is so far away that rays from the sun appear parallel to Earth observers and the falloff is negligible.
․Point lights are lights with a definite position in space which radiate light evenly in all directions. Point lights are usually subject to some form of attenuation, or falloff in the intensity of light incident on objects farther away; real-life light sources experience quadratic falloff. Finally, spotlights are like real-life spotlights, with a definite point in space, a direction, and an angle defining the cone of the spotlight. There is also often an ambient light value that is added to all final lighting calculations to arbitrarily compensate for global illumination effects which rasterization cannot calculate correctly.
․All shading algorithms need to account for distance from the light and the normal vector of the shaded object with respect to the incident direction of light. The fastest algorithms simply shade all pixels on any given triangle with a single lighting value, known as flat shading.
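A minimal sketch of such a single-value (flat shading) lighting calculation for a point light, combining the two factors just named — distance (with the quadratic falloff described above) and the surface normal — plus a constant ambient term. The function name and parameters are illustrative, not from any particular API:

```python
import math

def flat_shade(tri_center, tri_normal, light_pos, light_intensity, ambient=0.05):
    """One lighting value for a whole triangle: point light with quadratic
    falloff, Lambert-style facing factor, plus a constant ambient term."""
    # Vector from the triangle to the light, and its length.
    to_light = [light_pos[i] - tri_center[i] for i in range(3)]
    dist = math.sqrt(sum(c * c for c in to_light))
    l = [c / dist for c in to_light]                     # unit direction to light
    n_len = math.sqrt(sum(c * c for c in tri_normal))
    n = [c / n_len for c in tri_normal]                  # unit surface normal
    facing = max(sum(a * b for a, b in zip(l, n)), 0.0)  # back-facing light: 0
    return ambient + light_intensity * facing / dist ** 2  # quadratic attenuation

# A light 2 units directly above a floor triangle facing up:
print(flat_shade((0, 0, 0), (0, 0, 1), (0, 0, 2), light_intensity=4.0))  # 1.05
```

Gouraud and Phong shading, discussed next, evaluate essentially this computation per vertex or per pixel instead of once per triangle.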
․There is no way to create the illusion of smooth surfaces except by subdividing into many small triangles. Algorithms can also shade the vertices separately and interpolate the lighting values of the vertices when drawing pixels, known as Gouraud shading. The slowest and most realistic approach is to calculate lighting separately for each pixel, known as Phong shading; this performs bilinear interpolation of the normal vectors and uses the result to do a local lighting calculation.
․Acceleration techniques: To extract the maximum performance out of any rasterization engine, a minimum number of polygons should be sent to the renderer, culling out objects which cannot be seen.
․Backface culling: The simplest way to cull polygons is to cull all polygons which face away from the viewer, known as backface culling. Since most 3D objects are fully enclosed, polygons facing away from a viewer are always blocked by polygons facing towards the viewer, unless the viewer is inside the object. A polygon's facing is defined by its winding, or the order in which its vertices are sent to the renderer. A renderer can define either clockwise or counterclockwise winding as front- or back-facing. Once a polygon has been transformed to screen space, its winding can be checked, and if it is in the opposite direction, the polygon is not drawn at all. Note: backface culling cannot be used with degenerate and unclosed volumes.
․Spatial data structures can be used to cull out objects which are either outside the viewing volume or are occluded by other objects. The most common are binary space partitions, octrees, and cell-and-portal culling.
․Texture filtering is a further refinement, used to create clean images at any distance: Textures are created at specific resolutions, but since the surface they are applied to may be at any distance from the viewer, they can show up at arbitrary sizes on the final image. As a result, one pixel on screen usually does not correspond directly to one texel.
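The screen-space winding test used for backface culling, described earlier in this section, can be sketched with a single signed-area computation (an illustration; it assumes y grows upward — with y-down screen coordinates, the sense of the test flips):

```python
def signed_area(p0, p1, p2):
    """Twice the signed area of a screen-space triangle.

    The sign encodes the winding: positive for counterclockwise vertex
    order (with y up), negative for clockwise.
    """
    return (p1[0] - p0[0]) * (p2[1] - p0[1]) - (p2[0] - p0[0]) * (p1[1] - p0[1])

def is_front_facing(p0, p1, p2, front_winding="ccw"):
    """A renderer may define either winding as front-facing."""
    a = signed_area(p0, p1, p2)
    return a > 0 if front_winding == "ccw" else a < 0

print(is_front_facing((0, 0), (1, 0), (0, 1)))   # True  (counterclockwise)
print(is_front_facing((0, 0), (0, 1), (1, 0)))   # False (clockwise: culled)
```

A zero result corresponds to a degenerate (edge-on) triangle, which is also culled by this test.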
․Environment mapping is a form of texture mapping in which the texture coordinates are view-dependent. One common application, for example, is to simulate reflection on a shiny object: one can environment-map the interior of a room onto a metal cup in the room. As the viewer moves about the cup, the texture coordinates of the cup's vertices move accordingly, providing the illusion of reflective metal.
․Bump mapping is another form of texture mapping, which provides pixels not with color but with depth. Especially with modern pixel shaders, bump mapping creates the feel of view- and lighting-dependent roughness on a surface, greatly enhancing realism.
․Level of detail: Though the number of polygons in any scene can be phenomenal, a viewer in a scene will only be able to discern details of nearby objects. Objects right in front of the viewer can be rendered at full complexity, while objects farther away can be simplified dynamically, or even replaced completely with sprites.
․Shadow mapping and shadow volumes are two common modern techniques for creating shadows, taking object occlusion into consideration.
․Hardware acceleration: Most modern programs are written to interface with one of the existing graphics APIs, which drives a dedicated GPU. The latest GPUs feature support for programmable pixel shaders, which drastically improve the capabilities of programmers. The trend is towards full programmability of the graphics pipeline.
Graphics File Formats
․A bitmap is a two-dimensional array of values, and each element of the array corresponds to a single dot in a picture, i.e. a pixel. In most cases a pixel's value is an index into a table of colors, indicating that the pixel is to be displayed using the color pointed to by the index. The colors in the table are collectively referred to as a palette.
․Graphics file formats are termed bitmapped and vector formats.
They are fundamentally different: the former stores a complete, digitally encoded image, while the latter represents a picture as a series of lines, arcs, circles, and text, something like "move to (100,100), select color blue, circle with radius 50", etc. The main disadvantage of vector formats is that they can only reasonably describe a line drawing, not a photographic image. Metafiles contain a list of image-creation commands along with vectors and circles, and are really programs; drawing the image a metafile describes is impractical without access to the graphics package it depends on.
․If the palette sizes used are 16 or 256 colors, then the corresponding index sizes are 4 and 8 bits, referred to as the number of bits per pixel. In a bitmap of 4 bits per pixel, each byte holds 2 separate pixel values; in a bitmap of 8 bits per pixel, each byte represents a single pixel.
․Bitmaps that represent very large numbers of colors simultaneously generally do not employ the palette scheme; instead, a pixel's value directly defines a color.
․Pixel ordering: The simplest approach is to store the pixels a row at a time. Each row is referred to as a scan line, most often stored from left to right, with rows from top to bottom.
․Image geometry: Every computer image has an internal geometry used to position elements in the picture. The 2 most common are screen coordinates and graph coordinates. Screen coordinates, commonly used for display, place the origin (0,0) at the top-left corner with x increasing to the right and y increasing downward (the 2 scales may be different, as on the IBM VGA for example). Graph coordinates, often used for printing on paper, place the origin at the lower left with y increasing upward. [Figure: screen (left) and graph (right) coordinates]
․Bitmapped graphics file formats:
BMP   Microsoft Windows Bitmap, a general-purpose format for bitmapped images.
GIF   CompuServe Graphics Interchange Format, a general-purpose format for transmitting images by modem, utilizing data compression to reduce transmission times and also supporting interlaced images.
TIFF  Aldus/Microsoft Tagged Image File Format, complex, multipurpose, and open-ended, supporting all types of bitmaps and bitmap-related measures.
JPEG  Joint Photographic Experts Group, under the auspices of the ISO, fundamentally a bitmapped format. Instead of storing individual pixels, it stores blocks of data from which blocks of pixels can be approximately reconstructed; it is therefore called lossy.
․Interleaving: The simplest scheme is to store the even-numbered rows, then the odd rows, i.e. 0, 2, 4, ..., 1, 3, 5, ...; or perhaps 0, 2, 4, ..., 98, 99, 97, 95, ..., 3, 1, supposing there were a total of 100 rows. The original point was to match the order of scan lines used on TV, i.e. even rows on the downward sweep, odd rows on the upward sweep. Another advantage is that one can quickly construct an approximate version of the image without having to read the whole file.
․GIF uses a four-way interleave that first stores every eighth row, then three more sets of rows, each of which fills in the rows halfway between the ones already stored. GIF is copyrighted but freely used, and employs patented LZW compression.
․The most practical approach to dealing with a bitmapped image is to treat it as a collection of scan lines: writing functions that read and write scan lines, display scan lines, and the like.
(Supplementary material below)
****************************************************************************************************************
․Most adaptive-dictionary-based techniques have their roots in two landmark papers by Jacob Ziv and Abraham Lempel in 1977 and 1978, which gave rise to what we call the LZ77 family (also known as LZ1) and the LZ78 (or LZ2) family. The most well-known modification of LZ2 is the one by Terry Welch, known as LZW.
․LZW is called a dictionary-based compression method: highly repetitive source data or strings are encoded as index values that replace the original data, achieving compression. For example, the character 'a' can be replaced in the output by an index value such as 100. When a new string appears, a new index is added and the dictionary is extended, reducing the data volume. Because the method is simple, it is well suited to hardware implementation. It is used mostly for images with large areas of flat color, and is supported in formats such as TIFF, PDF, GIF, and PostScript; its drawback is increased storage time.
․Printer data files: 2 general types, namely extended text formats and page description languages.
The former embeds picture information inside a conventional text stream; that is, plain text prints as itself, and escape sequences introduce non-text elements, Hewlett-Packard's PCL being a de facto standard for low- to medium-performance laser printers, for example. The other approach is to define an entirely new language to describe what is to be printed on the page, PostScript having become the standard page description language, for example.
Converting File Types
1) Bitmap to bitmap: one reads a file format, extracts the array of pixels, and then writes the same pixels in any other format, the PBM utilities (PGM, PPM) supporting such transformations, for example. Image transformation of this kind has nothing to do with file processing per se!
․Promoting from a less expressive format to a more expressive format does nothing at all: a white pixel remains white, a 50 percent gray remains a 50 percent gray, and so forth. Conversion in the reverse direction is not easy; the goal is to produce the best-looking image possible given the limitations of the new format.
2) Color-to-gray conversion: For each pixel, one need only determine the pixel's luminance, a value conventionally computed from the component values as Y (or L) in slide 2.
․Color quantization: Sometimes one has a color image with more colors than the local hardware can handle, such as a full-color image to be displayed on a screen that can only show 64 or 256 colors. A process called quantization selects a representative set of colors and then assigns one of those colors to each pixel. For example, a digitized photograph using 256 gray-scale values displayed on a screen with 3 bits per pixel, showing 8 colors, is much coarser but still recognizable.
․Dithering: The limited number of colors is quite noticeable in areas with gradual changes of color. One way to decrease this effect is dithering: spreading the quantization error around from pixel to pixel to avoid unwanted step effects.
The result is a much smoother image than the previous example, using the same 8 colors but with dithering.
3) Vector-to-vector conversion reconciles the slightly different semantics of different formats and, to some degree, handles differing coordinate systems. For example, a 'circle' command in the original turns into a 'circle' command in the translated file. Problems arise when the two formats don't have corresponding commands; one might approximate a command or simulate it with a series of short line segments.
4) Vector to bitmap: Rasterization is the task of taking an image described in a vector graphics format (shapes) and converting it into a raster image (pixels or dots) for output on a video display or printer, or for storage in a bitmap file format. Rasterization also refers to the popular rendering algorithm for displaying three-dimensional shapes on a computer. Real-time applications need to respond immediately to user input, and generally need to produce frame rates of at least 25 frames per second to achieve smooth animation. Rasterization is simply the process of computing the mapping from scene geometry to pixels and does not prescribe a particular way to compute the color of those pixels. Shading, including programmable shading, may be based on physical light transport or artistic intent.
5) Bitmap-to-vector conversion is more difficult than any of the previous types.
Determining a File Format
․A frequent problem in graphics file processing is determining the format of a particular file. The easiest but least reliable way to do so is to use the file's extension: on most systems, a PCX file name ends with .PCX, a TIFF file with .TIF or .TIFF, and so on. A more reliable technique is what's known as the magic number approach. Nearly all file formats have an identifiable byte string, either by deliberate design or by fortunate coincidence, at or near the beginning of the file.
․A table of common types and magic numbers follows.
The length and offset are in decimal, with offset 0 being the beginning of the file. The magic numbers are written as pairs of characters, meaning hex byte values, or as single characters, meaning literal ASCII characters, and are written in the actual order the bytes appear in the file.

format      offset  length  value                              comments
MacPaint    0       4       00 00 00 02                        version number field, not always set correctly
PCX         0       1       0a
GEM IMG     0       4       00 01 00 08                        fourth byte may also be 09
IFF/ILBM    0       4       FORM
BMP         0       2       BM
Targa       0       1       00                                 no magic at front; the first byte is the length of the image ID — almost always 0, since most files don't have an ID. More advanced editions have an identifying string at the end of the file, which is "TRUEVISIONTARGA" followed by a period and a byte of binary zeros.
GIF 87      0       6       GIF87a                             no extensions
GIF 89      0       6       GIF89a
JFIF        0       11      FF D8 FF E0 xx xx 4A 46 49 46 00   the xx bytes vary from file to file
HP-GL       0       4       1B % 0 A                           files usually start with an HP-GL command, which is 2 uppercase letters followed by a digit or semicolon; a file to be printed on a PCL5 printer starts with this string to switch to HP-GL mode (the 0 is sometimes replaced by 1, 2, or 3)
WMF         0       6       01 00 09 00 00 03                  the 9 is the header size, which may in theory be larger if the header is extended; the 03 is the Windows version number. D7 CD C6 9A is the placeable WMF header.
PCL         0       2       1B E                               most begin with this printer reset sequence (some begin with a longer sequence starting with 1B and an asterisk)
PostScript  0       2       % !                                header string not mandatory, but found in nearly all files
PBM         0       2       P1                                 P4 for raw PBM
PGM         0       2       P2                                 P5 for raw PGM
PPM         0       2       P3                                 P6 for raw PPM

Refraction
․Refraction is the change in direction of a wave due to a change in its speed. This is most commonly observed when a wave passes from one medium to another.
Refraction of light is the most commonly observed phenomenon, but any type of wave can refract when it interacts with a medium, for example when sound waves pass from one medium into another or when water waves move into water of a different depth. Refraction is described by Snell's law, which states that the angle of incidence θ1 is related to the angle of refraction θ2 by
    sin θ1 / sin θ2 = v1 / v2 = n2 / n1 ,
where v1 and v2 are the wave velocities in the respective media, and n1 and n2 the refractive indices. In general, the incident wave is partially refracted and partially reflected; the details of this behavior are described by the Fresnel equations. [Figure: An image of the Golden Gate Bridge refracted and bent by many differing three-dimensional pools of water]
․The refractive index of a transparent substance or material is defined as the relative speed at which light moves through the material with respect to its speed in a vacuum. By convention, the refractive index of a vacuum is defined as having a value of 1.0, which serves as a universally accepted reference point. The index of refraction of other transparent materials, commonly identified by the variable n, is defined through the equation
    n (refractive index) = c / v ,
where c is the speed of light in a vacuum and v is the velocity of light in the material. [Figure: Refraction in a Perspex (acrylic) block]
Reflection mapping (supplementary material below)
․In computer graphics, reflection mapping is an efficient method of simulating a complex mirroring surface by means of a precomputed texture image. The texture is used to store the image of the environment surrounding the rendered object. There are several ways of storing the surrounding environment; the most common are spherical environment mapping, in which a single texture contains the image of the surroundings as reflected on a mirror ball, and cubic environment mapping, in which the environment is unfolded onto the six faces of a cube and therefore stored as six square textures.
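For cubic environment mapping as just described, a lookup direction must first be mapped to one of the six face textures and a texel within it. A minimal sketch follows (illustrative only; the face selection uses the usual dominant-axis rule, but the exact face orientation and (u,v) convention assumed here is one of several used by real APIs):

```python
def cube_face(d):
    """Map a direction vector d = (x, y, z) to a cube-map face and (u, v) in [0, 1]."""
    x, y, z = d
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax >= ay and ax >= az:                  # x is the dominant axis
        face, u, v, m = ("+x" if x > 0 else "-x"), (-z if x > 0 else z), -y, ax
    elif ay >= az:                             # y is the dominant axis
        face, u, v, m = ("+y" if y > 0 else "-y"), x, (z if y > 0 else -z), ay
    else:                                      # z is the dominant axis
        face, u, v, m = ("+z" if z > 0 else "-z"), (x if z > 0 else -x), -y, az
    # Divide by the dominant component and remap [-1, 1] to texture range [0, 1].
    return face, (u / m + 1) / 2, (v / m + 1) / 2

print(cube_face((1.0, 0.0, 0.0)))   # ('+x', 0.5, 0.5): center of the +x face
```

A reflected ray (computed as described in the cube-mapped reflection section below) would be passed through such a function to fetch the texel the camera "sees".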
․This kind of approach is more efficient than the classical ray-tracing approach of computing the exact reflection by shooting a ray and following its optically exact path, but the results are (sometimes crude) approximations of the real reflection. [Figure: An example of reflection mapping] Another important advantage is that it is the only way to create reflections of real-world backgrounds in synthetic objects. A typical drawback of this technique is the absence of self-reflections: you cannot see any part of the reflected object inside the reflection itself.
․Spherical environment mapping (sometimes known as standard environment mapping) involves the use of a textured hollow sphere whose inside surface has no parallax in relation to the object that reflects it (i.e. every point on the surface of the object reflects the same spherical data). A spherical texture is created, using a fisheye lens or by prerendering a preexisting virtual scene, and is mapped onto the sphere. Pixel colors in the final rendering pass are determined by calculating the reflection vectors from the points on the object to the texels in the environment map. This technique often produces results which are superficially similar to those generated by raytracing, but is less computationally expensive: the colors of the points to be referenced are known beforehand, simplifying the GPU workload down to calculating the angles of incidence and reflection.
․There are limitations to spherical mapping that detract from its realism. Because spherical maps are stored as azimuthal projections of the environments they represent, there is an abrupt point of singularity (a "black hole" effect) visible in the reflection on the object, where texel colors at or near the edge of the map are distorted due to inadequate resolution to represent the points accurately.
․Cube mapping was developed to address this issue.
If cube maps are made and filtered correctly, they have no visible seams (see below for a detailed explanation). They have since superseded sphere maps in many contemporary graphical applications, namely realtime rendering.
․Cube-mapped reflection is a technique that uses cube mapping to make objects look like they reflect the environment around them. Generally, this is done with the same skybox that is used in outdoor renderings. Although this is not a true reflection, since objects around the reflective one will not be seen in the reflection, the desired effect is usually achieved. [Figure: A diagram depicting a reflection provided by cube-mapped reflection; the map is projected onto the surface from the point of view of the observer] Highlights, which in raytracing would be provided by tracing the ray and determining the angle it makes with the normal, can be 'fudged' if they are manually painted into the texture field (or if they already appear there, depending on how the texture map was obtained), from where they will be projected onto the mapped object along with the rest of the texture detail.
․Cube-mapped reflection is done by determining the vector along which the object is being viewed. This camera ray is reflected about the surface normal at the point where the camera vector intersects the object.
․This results in the reflected ray, which is then passed to the cube map to get the texel which the camera then sees as if it were on the surface of the object. This creates the effect that the object is reflective.
․HEALPix environment mapping is a technique basically like cube mapping, but it uses a HEALPix map because it preserves details better than a cube map does.
․Application in real-time 3D graphics: Cube-mapped reflection, when used correctly, may be the fastest method of rendering a reflective surface. To increase the speed of rendering, each vertex calculates the position of the reflected ray.
[Figure: Example of a three-dimensional model using cube-mapped reflection] Then the position is interpolated across the polygons to which the vertex is attached. This eliminates the need for recalculating every pixel's reflection.
․If normal mapping is used, each polygon has many face normals (the direction a given point on a polygon is facing), which can be used in tandem with an environment map to produce a more realistic reflection. In this case, the angle of reflection at a given point on a polygon will take the normal map into consideration. This technique is used to make an otherwise flat surface appear textured, for example corrugated metal or brushed aluminium.
․HEALPix-mapped reflection, like cube mapping, is among the fastest methods of rendering a reflective surface when used correctly.
Lambertian reflectance
․If a surface exhibits Lambertian reflectance, light falling on it is scattered such that the apparent brightness of the surface to an observer is the same regardless of the observer's angle of view; more technically, the surface luminance is isotropic. For example, unfinished wood exhibits roughly Lambertian reflectance, but wood finished with a glossy coat of polyurethane does not, since specular highlights may appear at different locations on the surface. Not all rough surfaces are perfect Lambertian reflectors, but this is often a good approximation when the characteristics of the surface are unknown. Lambertian reflectance is named after Johann Heinrich Lambert.
․In computer graphics, Lambertian reflection is often used as a model for diffuse reflection. This technique causes all closed polygons (such as a triangle within a 3D mesh) to reflect light equally in all directions when rendered. The effect this has from the viewer's perspective is that rotating or scaling the object does not change the apparent brightness of its surface.
The reflection is calculated by taking the dot product of the surface's normalized normal vector N and a normalized light-direction vector L, pointing from the surface to the light source. This number is then multiplied by the color of the surface and the intensity of the light hitting the surface:
    ID = (L∙N) C IL ,
where ID is the intensity of the diffusely reflected light (surface brightness), C is the color, and IL is the intensity of the incoming light. Because L∙N = |L||N| cos θ, where θ is the angle between the two vectors, the intensity will be highest if the normal vector points in the same direction as the light vector (cos 0 = 1; the surface is perpendicular to the direction of the light), and lowest if the normal vector is perpendicular to the light vector (cos(π/2) = 0; the surface runs parallel to the direction of the light).
․Lambertian reflection is typically accompanied by specular reflection, where the surface luminance is highest when the observer is situated in the perfect reflection direction and falls off sharply. This is simulated in computer graphics with various specular reflection models such as Phong, Cook–Torrance, etc. Spectralon (a thermoplastic resin that can be machined into a wide variety of shapes for the fabrication of optical components; Spectralon gives the highest diffuse reflectance of any known material) is a material designed to exhibit an almost perfect Lambertian reflectance, while Scotchlite (a retroreflective material whose surface carries retro-reflectors that return light back toward the direction of its source; designed specifically for safety, its millions of corner cubes per square inch greatly increase the wearer's visibility in dim light or at night) is a material designed with the opposite intent of only reflecting light along one line of sight. While Lambertian reflectance usually refers to the reflection of light by an object, it can be used to refer to the reflection of any wave. For example, in ultrasound imaging, "rough" tissues are said to exhibit Lambertian reflectance.
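The diffuse calculation above, ID = (L∙N) C IL, can be sketched directly (a minimal illustration; the clamp to zero for back-facing light is the usual rendering convention rather than part of the formula itself, and `numpy` is assumed for the vector math):

```python
import numpy as np

def lambert_diffuse(normal, light_dir, color, light_intensity):
    """I_D = (L.N) * C * I_L, with L.N clamped at 0 so light arriving from
    behind the surface contributes nothing."""
    n = np.asarray(normal, dtype=float)
    n /= np.linalg.norm(n)                     # normalized surface normal N
    l = np.asarray(light_dir, dtype=float)
    l /= np.linalg.norm(l)                     # normalized direction to light L
    return max(np.dot(l, n), 0.0) * np.asarray(color) * light_intensity

# Light hitting the surface head-on gives full brightness...
print(lambert_diffuse([0, 0, 1], [0, 0, 1], [1.0, 0.2, 0.2], 1.0))
# ...and grazing light (normal perpendicular to L, cos(pi/2) = 0) gives none.
print(lambert_diffuse([0, 0, 1], [1, 0, 0], [1.0, 0.2, 0.2], 1.0))
```

Because the result depends only on N and L, not on the viewing direction, rotating the camera around the object leaves the computed brightness unchanged, matching the view-independence described above.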
Lambert's cosine law ․In optics, Lambert's cosine law says that the radiant intensity observed from a "Lambertian" surface is directly proportional to the cosine of the angle θ between the observer's line of sight and the surface normal. The law is also known as the cosine emission law or Lambert's emission law. ․An important consequence of Lambert's cosine law is that when such a surface is viewed from any angle, it has the same apparent radiance. This means, for example, that to the human eye it has the same apparent brightness (or luminance). It has the same radiance because, although the emitted power from a given area element is reduced by the cosine of the emission angle, the size of the observed area is decreased by a corresponding amount. Therefore, its radiance (power per unit solid angle per unit projected source area) is the same. For example, in the visible spectrum, the Sun is not a Lambertian radiator; its brightness is a maximum at the center of the solar disk, an example of limb darkening. A black body is a perfect Lambertian radiator. ․Lambertian scatterers: When an area element is radiating as a result of being illuminated by an external source, the irradiance (energy or photons/time/area) landing on that area element will be proportional to the cosine of the angle between the illuminating source and the normal. A Lambertian scatterer will then scatter this light according to the same cosine law as a Lambertian emitter. This means that although the radiance of the surface depends on the angle from the normal to the illuminating source, it will not depend on the angle from the normal to the observer. For example, if the moon were a Lambertian scatterer, one would expect to see its scattered brightness appreciably diminish towards the terminator due to the increased angle at which sunlight hit the surface. 
The fact that it does not diminish illustrates that the moon is not a Lambertian scatterer, and in fact tends to scatter more light into the oblique angles than a Lambertian scatterer would.
Details of the equal-brightness effect: The situation for a Lambertian surface (emitting or scattering) is illustrated in Figures 1 and 2. For conceptual clarity, we will think in terms of photons rather than energy or luminous energy.
Figure 1: Emission rate (photons/s) in a normal and off-normal direction. The number of photons/sec directed into any wedge is proportional to the area of the wedge.
The wedges in the circle each represent an equal angle dΩ and, for a Lambertian surface, the number of photons per second emitted into each wedge is proportional to the area of the wedge. It can be seen that the length of each wedge is the product of the diameter of the circle and cosθ. It can also be seen that the maximum rate of photon emission per unit solid angle is along the normal and diminishes to zero for θ = 90°. In mathematical terms, the radiance along the normal is I photons/(s·cm²·sr) and the number of photons per second emitted into the vertical wedge is I dΩ dA. The number of photons per second emitted into the wedge at angle θ is I cosθ dΩ dA.
Figure 2 represents what an observer sees. The observer directly above the area element will be seeing the scene through an aperture of area dA₀, and the area element dA will subtend a (solid) angle of dΩ₀. We can assume without loss of generality that the aperture happens to subtend solid angle dΩ when "viewed" from the emitting area element. This normal observer will then be recording I dΩ dA photons per second and so will be measuring a radiance of I₀ = I dΩ dA / (dΩ₀ dA₀) photons/(s·cm²·sr).
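This bookkeeping can be checked numerically: the recorded photon rate falls as cosθ, but so does the solid angle the area element subtends at the observer, so the measured radiance is independent of θ. A minimal sketch, with the sample values invented for illustration (throughput conservation, dΩ·dA = dΩ₀·dA₀, links the two solid angles):

```python
import math

I = 100.0      # radiance along the normal, photons/(s*cm^2*sr)
dA = 1e-4      # emitting area element (cm^2)
dA0 = 1.0      # observing aperture area (cm^2)
dOmega = 1e-3  # solid angle of the aperture as seen from the emitter (sr)
# Throughput conservation fixes the solid angle the emitter subtends at
# the observer: dOmega0 * dA0 = dOmega * dA.
dOmega0 = dOmega * dA / dA0

for deg in (0, 30, 60, 80):
    theta = math.radians(deg)
    photons_per_s = I * math.cos(theta) * dOmega * dA  # rate the observer records
    subtended = dOmega0 * math.cos(theta)              # apparent solid angle of dA
    radiance = photons_per_s / (subtended * dA0)       # what the observer measures
    print(f"{deg:2d} deg: radiance = {radiance:g}")    # same value at every angle
```

The cosθ factors cancel, which is precisely the equal-brightness effect derived above.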
Figure 2: Observed intensity (photons/(s·cm²·sr)) for a normal and an off-normal observer; dA₀ is the area of the observing aperture and dΩ is the solid angle subtended by the aperture from the viewpoint of the emitting area element.
The observer at angle θ to the normal will be seeing the scene through the same aperture of area dA₀, and the area element dA will subtend a (solid) angle of dΩ₀ cosθ. This observer will be recording I cosθ dΩ dA photons per second, and so will be measuring a radiance of I₀ = I cosθ dΩ dA / (dΩ₀ cosθ dA₀) = I dΩ dA / (dΩ₀ dA₀) photons/(s·cm²·sr), which is the same as the normal observer.
specular highlight
․A specular highlight is the bright spot of light that appears on shiny objects when illuminated (for example, see the image at right: specular highlights on a pair of spheres). Specular highlights are important in 3D computer graphics, as they provide a strong visual cue for the shape of an object and its location with respect to light sources in the scene.
․Microfacets: The term specular means that light is perfectly reflected in a mirror-like way from the light source to the viewer. Specular reflection is visible only where the surface normal is oriented precisely halfway between the direction of incoming light and the direction of the viewer; this is called the half-angle direction because it bisects (divides into halves) the angle between the incoming light and the viewer. Thus, a specularly reflecting surface would show a specular highlight as the perfectly sharp reflected image of a light source. However, many shiny objects show blurred specular highlights. This can be explained by the existence of microfacets. We assume that surfaces that are not perfectly smooth are composed of many very tiny facets, each of which is a perfect specular reflector. These microfacets have normals that are distributed about the normal of the approximating smooth surface. The degree to which microfacet normals differ from the smooth surface normal is determined by the roughness of the surface.
․The reason for blurred specular highlights is now clear. At points on the object where the smooth normal is close to the half-angle direction, many of the microfacets point in the half-angle direction and so the specular highlight is bright. As one moves away from the center of the highlight, the smooth normal and the half-angle direction get farther apart; the number of microfacets oriented in the half-angle direction falls, and so the intensity of the highlight falls off to zero.
․The specular highlight often reflects the color of the light source, not the color of the reflecting object. This is because many materials have a thin layer of clear material above the surface of the pigmented material. For example, plastic is made up of tiny beads of color suspended in a clear polymer, and human skin often has a thin layer of oil or sweat above the pigmented cells. Such materials will show specular highlights in which all parts of the color spectrum are reflected equally. On metallic materials such as gold, the color of the specular highlight will reflect the color of the material.
․Models of microfacets: A number of different models exist to predict the distribution of microfacets. Most assume that the microfacet normals are distributed evenly around the normal; these models are called isotropic. If microfacets are distributed with a preference for a certain direction along the surface, the distribution is anisotropic. NOTE: In most equations, when it says (A·B) it means max(0, (A·B)).
․Phong distribution: In the Phong reflection model, the intensity of the specular highlight is calculated as k_spec = |R||V| cos^n β, or equivalently k_spec = (R·V)^n for normalized vectors, where R is the mirror reflection of the light vector off the surface, V is the viewpoint vector, and β is the angle between them.
․In the Blinn–Phong shading model, the intensity of a specular highlight is calculated as k_spec = |N||H| cos^n β, or equivalently k_spec = (N·H)^n, where N is the smooth surface normal and H is the half-angle direction (the direction vector midway between L, the vector to the light, and V, the viewpoint vector).
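Both forms can be sketched directly from these definitions (helper names are invented; all vectors are assumed normalized):

```python
import math

def normalize(v):
    m = math.sqrt(sum(x * x for x in v))
    return tuple(x / m for x in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def reflect(l, n):
    """Mirror reflection R = 2(L.N)N - L of the light vector about the normal."""
    d = dot(l, n)
    return tuple(2 * d * ni - li for li, ni in zip(l, n))

def phong_spec(l, n, v, exponent):
    """Phong: k_spec = (R.V)^n, clamped at 0."""
    r = reflect(l, n)
    return max(0.0, dot(r, v)) ** exponent

def blinn_phong_spec(l, n, v, exponent):
    """Blinn-Phong: k_spec = (N.H)^n, with H the half-angle vector."""
    h = normalize(tuple(li + vi for li, vi in zip(l, v)))
    return max(0.0, dot(n, h)) ** exponent

n = (0.0, 0.0, 1.0)
l = normalize((1.0, 0.0, 1.0))
v = normalize((-1.0, 0.0, 1.0))   # viewer sits exactly on the mirror direction
print(phong_spec(l, n, v, 32))    # R == V here, so the highlight is maximal
print(blinn_phong_spec(l, n, v, 32))  # and H == N, so Blinn-Phong agrees
```

With the viewer on the mirror direction both models return (essentially) 1; moving `v` away from that direction makes both fall off, faster for larger exponents.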
․The number n is called the Phong exponent, and is a user-chosen value that controls the apparent smoothness of the surface. These equations imply that the distribution of microfacet normals is an approximately Gaussian distribution, or approximately a Pearson type II distribution, of the corresponding angle. While this is a useful heuristic and produces believable results, it is not a physically based model.
․A slightly better model of the microfacet distribution can be created using a Gaussian distribution. The usual function calculates the specular highlight intensity as k_spec = e^(−(γ/m)²), where γ is the angle between N and the half-angle direction H, and m is a constant between 0 and 1 that controls the apparent smoothness of the surface.
․A physically based model of the microfacet distribution is the Beckmann distribution: k_spec = exp(−tan²γ / m²) / (π m² cos⁴γ), where γ is again the angle between N and H, and m is the average slope of the surface microfacets. This function gives very accurate results, but is also rather expensive to compute.
․The Heidrich–Seidel distribution is a simple anisotropic distribution, based on the Phong model. It can be used to model surfaces that have small parallel grooves or fibers, such as brushed metal, satin, and hair. The specular highlight intensity for this distribution is k_spec = [√(1−(L·T)²) √(1−(V·T)²) − (L·T)(V·T)]^n, where n is the anisotropic exponent, V is the viewing direction, L is the direction of incoming light, and T is the direction parallel to the grooves or fibers at this point on the surface. If you have a unit vector D which specifies the global direction of the anisotropic distribution, you can compute the vector T at a given point by projecting D onto the tangent plane, T = D − (D·N)N (then normalizing), where N is the unit normal vector at that point on the surface. You can also easily compute the cosine of the angle between the vectors by using a property of the dot product, and the sine of the angle by using the trigonometric identities.
․The anisotropic k_spec should be used in conjunction with a non-anisotropic distribution like a Phong distribution to produce the correct specular highlight.
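The two isotropic models above take only the angle γ between N and H, so they can be evaluated directly (a sketch; the `m` values are arbitrary examples, not recommended material settings):

```python
import math

def gaussian_spec(gamma, m):
    """Gaussian microfacet model: k_spec = exp(-(gamma/m)^2).

    gamma is the angle (radians) between the surface normal N and the
    half-angle vector H; m in (0, 1] controls apparent smoothness.
    """
    return math.exp(-(gamma / m) ** 2)

def beckmann_spec(gamma, m):
    """Beckmann distribution: k_spec = exp(-(tan(gamma)/m)^2) / (pi m^2 cos^4(gamma)).

    m is the average (rms) slope of the surface microfacets.
    """
    c = math.cos(gamma)
    return math.exp(-(math.tan(gamma) / m) ** 2) / (math.pi * m * m * c ** 4)

# Both models peak when H lines up with N (gamma = 0) and fall off with angle.
for deg in (0, 10, 30):
    g = math.radians(deg)
    print(f"{deg:2d} deg: gaussian={gaussian_spec(g, 0.35):.4f}  "
          f"beckmann={beckmann_spec(g, 0.35):.4f}")
```

Note the Gaussian model peaks at exactly 1, while Beckmann is a normalized distribution whose peak height depends on m, which is one reason it is the more physically meaningful of the two.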
․The Ward anisotropic distribution uses two user-controllable parameters α_x and α_y to control the anisotropy. If the two parameters are equal, an isotropic highlight results. The specular term in the distribution is
k_spec = (N·L) · (1/√((N·L)(N·V))) · (1/(4π α_x α_y)) · exp(−2 [((H·X)/α_x)² + ((H·Y)/α_y)²] / (1 + H·N))
The specular term is zero if N·L < 0 or N·V < 0. All vectors are unit vectors. The vector V is the vector from the surface point to the eye, L is the direction from the surface point to the light, H is the half-angle direction, N is the surface normal, and X and Y are two orthogonal vectors in the normal plane which specify the anisotropic directions.
․The Cook–Torrance model uses a specular term of the form k_spec = D F G / (π (E·N)(N·L)). Here D is the Beckmann distribution factor and F is the Fresnel term; for performance reasons, in real-time 3D graphics Schlick's approximation F ≈ F₀ + (1−F₀)(1 − E·H)⁵ (with F₀ the reflectance at normal incidence) is often used to approximate the Fresnel term. G is the geometric attenuation term, describing self-shadowing due to the microfacets, and is of the form
G = min(1, 2(H·N)(E·N)/(E·H), 2(H·N)(L·N)/(E·H))
In these formulas E is the vector to the camera or eye, H is the half-angle vector, L is the vector to the light source and N is the normal vector, and α is the angle between H and N.
․Using multiple distributions: If desired, different distributions (usually using the same distribution function with different values of m or n) can be combined using a weighted average. This is useful for modeling, for example, surfaces that have small smooth and rough patches rather than uniform roughness.
Phong shading
․Phong shading refers to a set of techniques in 3D computer graphics. Phong shading includes a model for the reflection of light from surfaces and a compatible method of estimating pixel colors by interpolating surface normals across rasterized polygons.
․The model of reflection may also be referred to as the Phong reflection model, Phong illumination or Phong lighting. It may be called Phong shading in the context of pixel shaders or other places where a lighting calculation can be referred to as "shading".
The interpolation method may also be called Phong interpolation, which is usually referred to as "per-pixel lighting". Typically it is called "shading" when contrasted with other interpolation methods such as Gouraud shading or flat shading. The Phong reflection model may be used in conjunction with any of these interpolation methods.
․Phong reflection is an empirical model of local illumination. It describes the way a surface reflects light as a combination of the diffuse reflection of rough surfaces with the specular reflection of shiny surfaces. It is based on Bui Tuong Phong's informal observation that shiny surfaces have small intense specular highlights, while dull surfaces have large highlights that fall off more gradually. The reflection model also includes an ambient term to account for the small amount of light that is scattered about the entire scene.
Visual illustration of the Phong equation: here the light is white, the ambient and diffuse colors are both blue, and the specular color is white, reflecting almost all of the light hitting the surface, but only in very narrow highlights. The intensity of the diffuse component varies with the direction of the surface, and the ambient component is uniform (independent of direction).
For each light source in the scene, we define the components i_s and i_d as the intensities (often as RGB values) of the specular and diffuse components of the light source respectively. A single term i_a controls the ambient lighting; it is sometimes computed as a sum of contributions from all light sources.
For each material in the scene, we define:
k_s: specular reflection constant, the ratio of reflection of the specular term of incoming light
k_d: diffuse reflection constant, the ratio of reflection of the diffuse term of incoming light (Lambertian reflectance)
k_a: ambient reflection constant, the ratio of reflection of the ambient term present in all points in the scene rendered
α: a shininess constant for this material, which is larger for surfaces that are smoother and more mirror-like. When this constant is large, the specular highlight is small.
․We further define lights as the set of all light sources, L as the direction vector from the point on the surface toward each light source, N as the normal at this point on the surface, R as the direction that a perfectly reflected ray of light would take from this point on the surface, and V as the direction pointing towards the viewer (such as a virtual camera). Then the Phong reflection model provides an equation for computing the shading value of each surface point I_p:
I_p = k_a i_a + Σ_lights [ k_d (L·N) i_d + k_s (R·V)^α i_s ]
․The diffuse term is not affected by the viewer direction (V). The specular term is large only when the viewer direction (V) is aligned with the reflection direction R. Their alignment is measured by the α power of the cosine of the angle between them. The cosine of the angle between the normalized vectors R and V is equal to their dot product. When α is large, in the case of a nearly mirror-like reflection, the specular highlight will be small, because any viewpoint not aligned with the reflection will have a cosine less than one that rapidly approaches zero when raised to a high power. When we have color representations as RGB values, this equation will typically be calculated separately for the R, G and B intensities.
Phong shading interpolation example
․Phong shading improves upon Gouraud shading and provides a better approximation of the shading of a smooth surface.
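The Phong reflection equation can be evaluated directly from these definitions. A minimal single-channel sketch (the function name and the sample constants are invented for illustration):

```python
import math

def normalize(v):
    m = math.sqrt(sum(x * x for x in v))
    return tuple(x / m for x in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def phong_illumination(ka, kd, ks, alpha, ia, lights, N, V):
    """I_p = ka*ia + sum over lights of [ kd*(L.N)*i_d + ks*(R.V)^alpha * i_s ].

    Each entry of `lights` is (L, i_d, i_s) with L a unit vector toward
    the light. Dot products are clamped at 0, per the usual convention.
    """
    Ip = ka * ia
    for L, i_d, i_s in lights:
        LN = max(0.0, dot(L, N))
        # R = 2(L.N)N - L: the perfectly reflected light direction.
        R = tuple(2 * LN * n - l for l, n in zip(L, N))
        RV = max(0.0, dot(R, V))
        Ip += kd * LN * i_d + ks * (RV ** alpha) * i_s
    return Ip

N = (0.0, 0.0, 1.0)
V = normalize((0.0, 1.0, 1.0))                       # viewer on the mirror side
lights = [(normalize((0.0, -1.0, 1.0)), 0.7, 0.9)]   # one light: (L, i_d, i_s)
print(phong_illumination(ka=0.1, kd=0.6, ks=0.8, alpha=16,
                         ia=0.2, lights=lights, N=N, V=V))
```

For RGB rendering the same function would simply be applied per channel, as the text notes.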
Phong shading assumes a smoothly varying surface normal vector. The Phong interpolation method works better than Gouraud shading when applied to a reflection model that has small specular highlights, such as the Phong reflection model.
․The most serious problem with Gouraud shading occurs when specular highlights are found in the middle of a large polygon. Since these specular highlights are absent from the polygon's vertices and Gouraud shading interpolates based on the vertex colors, the specular highlight will be missing from the polygon's interior. This problem is fixed by Phong shading.
․Unlike Gouraud shading, which interpolates colors across polygons, in Phong shading we linearly interpolate a normal vector across the surface of the polygon from the polygon's vertex normals. The surface normal is interpolated and normalized at each pixel and then used in the Phong reflection model to obtain the final pixel color. Phong shading is more computationally expensive than Gouraud shading, since the reflection model must be computed at each pixel instead of at each vertex. On some modern hardware, variants of this algorithm are implemented using pixel or fragment shaders. This can be accomplished by coding normal vectors as secondary colors for each polygon, having the rasterizer use Gouraud shading to interpolate them, and interpreting them appropriately in the pixel or fragment shader to calculate the light for each pixel based on this normal information.
**************************************************************** (Supplementary material below) *****************
․Dithering can enhance the range and effectiveness of a display device having fewer color or intensity resources. Its fundamental purpose is, in a sense, to fool the human eye into thinking a dithered image contains more colors or gray levels than it actually has.
․The eye itself has a resolution limit of about 1 minute of arc.
At a distance of 10–12 inches, a typical viewing distance for a printed page, this corresponds to a width of about 0.003 inches (and, as it turns out, 0.003 inches is about the size of the dots printed by a 300-dpi laser printer). It means that two dots or objects closer together than 0.003 inches cannot be distinguished at that distance. On complex imagery we don't always 'see' to this resolution limit; the human brain performs spatial integration, averaging intensities in small areas to reduce the image's apparent complexity.
․Another important property of the eye is its intensity response, which can be thought of as a graph of actual versus perceived intensity. This response is not linear but logarithmic. Thus, the perceived relative brightness of two light sources is based on the ratio of their intensities, not the difference of their intensities. The smallest intensity change the eye can detect is an intensity ratio of about 1.01, which has important implications for image rendering.
․A continuous-tone image is one in which intensity changes between adjacent image areas are fine enough that the eye cannot perceive discrete differences. In the case of a black-and-white image, this occurs when the intensity ratio between adjacent areas is 1.01 or less.
․Dynamic Range: The ratio of the brightest perceived intensity to the least bright one is referred to as the medium's dynamic range.
․There must be enough reproducible intensities, or grey levels, between the two intensity extremes such that successive intensities vary by a ratio of 1.01 or less in order to produce a continuous-tone image. That is, the minimum number of required gray levels, n, is n = log_1.01(1/i0), where the maximum intensity is normalized to 1 and i0 is the least intensity, so that 1/i0 is the dynamic range. Newsprint, for example, has a dynamic range of about 10 and, hence, the minimum number of intensities is about 230.
Ex: The paper printed in a laser printer or copier has a dynamic range of perhaps 20. How does a 256-level grey image appear on the newsprint or on the copier? Is it continuous?
A: On newsprint, yes: a dynamic range of about 10 requires only about 230 distinguishable levels, so 256 gray levels suffice for continuous tone. On the copier paper, however, a dynamic range of about 20 calls for roughly 300 levels, so 256 levels fall just short.
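The level counts can be checked directly from n = log_1.01(dynamic range); a minimal sketch (the function name is invented, and the text's "about 230" for newsprint is this value loosely rounded):

```python
import math

def min_gray_levels(dynamic_range, jnd_ratio=1.01):
    """Minimum number of intensity steps so that successive levels differ
    by a ratio of at most `jnd_ratio` (~1% just-noticeable difference):
    n = log(dynamic_range) / log(jnd_ratio), rounded up.
    """
    return math.ceil(math.log(dynamic_range) / math.log(jnd_ratio))

print(min_gray_levels(10))   # newsprint: 232 levels needed, so 256 suffice
print(min_gray_levels(20))   # copier paper: 302 levels needed, so 256 fall short
```

Increasing the dynamic range of the medium therefore raises, logarithmically, the number of gray levels needed for continuous tone.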
