2012, Issue 06, Vol. 17 (Serial No. 99), pp. 1-8
Current Status and Prospects of Core Computer Vision Technologies
Foundation: National Natural Science Foundation of China (Grant No. 61202183)
DOI: 10.13682/j.issn.2095-6533.2012.06.018
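Abstract (translated from the Chinese original):

Digital images and video data contain rich visual resources, and how to extract and analyze the useful information in them intelligently has become a research hotspot in recent years. Computer vision (CV) technology has developed against this academic background and is now widely applied in manufacturing, intelligent security inspection, image retrieval, medical image analysis, human-computer interaction, and other fields. At the same time, computer vision still faces many problems, such as ambiguous descriptions of semantic information and image feature detection that is unstable and inefficient. Focusing on the core techniques of computer vision, this paper discusses the root causes of these problems and, based on a discussion of the latest techniques, describes the basic theoretical framework. Finally, building on the above, it reviews the important theoretical topics of each period and discusses development trends in computer vision.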

Abstract:

Digital images and video data carry rich information about their content. The effective and efficient extraction and analysis of that information has been the core of computer vision (CV) research. In the last two decades, CV applications have been widely used in automated manufacturing, intelligent surveillance systems, image retrieval and management, medical image analysis, and human-computer interaction. Although substantial progress has been made in many areas, current CV techniques still suffer from problems such as low accuracy, slow speed and, more prominently, semantic ambiguity. This paper provides a brief review of the development of CV, its core theoretical foundation, key techniques, successful applications, and main challenges. Future trends are also predicted at the end.

Basic information:

Chinese Library Classification (CLC) number: TP391.41

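Citation:

[1] Xu Zhijie (许志杰), Wang Jing (王晶), Liu Ying (刘颖), et al. Current Status and Prospects of Core Computer Vision Technologies [J]. Journal of Xi'an Institute of Posts and Telecommunications (西安邮电学院学报), 2012, 17(06): 1-8. DOI: 10.13682/j.issn.2095-6533.2012.06.018.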
