運(yùn)籌學(xué)方法、模型在生物信息學(xué)、系統(tǒng)生物學(xué)研究中的重要作用_第1頁
已閱讀1頁,還剩40頁未讀 繼續(xù)免費(fèi)閱讀

下載本文檔

版權(quán)說明:本文檔由用戶提供并上傳,收益歸屬內(nèi)容提供方,若內(nèi)容存在侵權(quán),請進(jìn)行舉報(bào)或認(rèn)領(lǐng)

文檔簡介

1、1,章祥蓀,復(fù)雜網(wǎng)絡(luò)的社團(tuán)結(jié)構(gòu)分析Community structure in complex networks,http://zhangroup.aporc.org中國科學(xué)院 數(shù)學(xué)與系統(tǒng)科學(xué)研究院全國復(fù)雜網(wǎng)絡(luò)會(huì)議,蘇州大學(xué), 2010,10, 17,Bio-molecular networks (生物分子網(wǎng)),許多生物問題, 特別是人類的疾病, 在分子層面上都可歸于 “systems problems” --

2、 Leroy Hood 許多生物問題可以表達(dá)成生物分子網(wǎng)絡(luò)(bio-molecular networks)的問題。生物分子網(wǎng)絡(luò)包括:蛋白質(zhì)相互作用網(wǎng)( protein interaction networks), 新陳代謝網(wǎng)(metabolic networks),基因調(diào)控網(wǎng)( gene regulatory networks), e.t.; 他們都有共同的性質(zhì) 更為有趣的是,許多這樣的網(wǎng)是“復(fù)雜”網(wǎng)絡(luò),2,3,,,復(fù)

3、雜網(wǎng)絡(luò)的典型代表:生物分子網(wǎng)絡(luò)之一 ---- 蛋白質(zhì)相互作用網(wǎng) (Scale-free),酵母細(xì)胞中的蛋白質(zhì)相互作用網(wǎng)絡(luò) (A.-L. Barabási, NATURE REVIEWS GENETICS, 2004),Jeong, 2000, Nature,包括太古代( Archae),細(xì)菌( Becterium), 真核生物(Eukaryote)在內(nèi)的43個(gè)物種的 新陳代謝網(wǎng)( Metabolic network )都

4、是 Scale-free的。,4,Protein-protein interaction networks,Rui-Sheng Wang, Yong Wang, Ling-Yun Wu, Xiang-Sun Zhang, Luonan Chen.Analysis on multi-domain cooperation for predicting protein-protein interactions.BMC Bioinforma

5、tics, 8:391, 2007 Shihua Zhang, Xue-Mei Ning and Xiang-Sun Zhang.Identification of functional modules in a PPI network by clique percolation clustering.Computational biology and chemistry, 30(6), 445-451, 2006. Luo

6、nan Chen, Ling-Yun Wu, Yong Wang and Xiang-Sun Zhang.Inferring Protein Interactions from Experimental Data by Association Probabilistic Method.Proteins: Structure, Function, and Bioinformatics, Vol. 62, pp. 833-837, 20

7、06. Xiang-Sun Zhang, Rui-Sheng Wang, Ling-Yun Wu, Shihua Zhang and Luonan Chen.Inferring Protein-Protein Interactions by Combinatorial Models.IFMBE Proceedings, Vol.14, 2006, 183--186, Springer Berlin Heidelberg.,5,M

8、etabolic and signaling networks,Zhenping Li, Rui-Sheng Wang, Xiang-Sun Zhang and Luonan Chen. Detecting drug targets with minimum side effects in metabolic networks. IET Systems Biology, 3(6), 523-533, 2009 Zhenping

9、Li, Rui-Sheng Wang, Xiang-Sun Zhang.Mass Flow Model and Essentiality of Enzymes in Metabolic Networks.Lecture Notes in Operations Research, 9, pp. 182-190, World Publishing Corporation, Lijiang, 2008. Jin G, Zhou X,

10、Wang H, Zhao H, Cui K, Zhang XS, Chen L, Hazen SL, Li K, Wong ST The Knowledge-Integrated Network Biomarkers Discovery for Major Adverse Cardiac Events. J Proteome Res 7(9): 4013-4021,2008,6,Luonan Chen,

11、 Rui-Sheng Wang, Xiang-Sun Zhang.Biomolecular Networks: Methods and Applications in Systems Biology.John Wiley & Sons, Hoboken, New Jersey. July, 2009.,Book about Biomolecular networks,7,可分成564 個(gè)模塊,由 950 個(gè)顯著的塊間相互作用

12、相連接。,Yeast functional linkage network,DNA damage module,SCIENCE Vol 306(26) 2004,,復(fù)雜網(wǎng)絡(luò)的動(dòng)態(tài)性質(zhì)研究 復(fù)雜網(wǎng)絡(luò)的靜態(tài)結(jié)構(gòu)研究小世界(Small world) ,尺度無關(guān)(Scale free),聚類特性 (Clustering) 的確切數(shù)學(xué)模型。 社團(tuán)結(jié)構(gòu) (Community Structure) …………,9,10,復(fù)雜網(wǎng)絡(luò)的

13、模塊化性質(zhì),復(fù)雜網(wǎng)絡(luò)中存在模塊或者社區(qū)結(jié)構(gòu) (Module or Community structure) 模塊或者社區(qū)定義為網(wǎng)絡(luò)中內(nèi)部連接稠密,與外部連接稀疏的節(jié)點(diǎn)的集合 (Filippo Radicchi et. al. PNAS, Vol.101, No.9, 2658-2663, 2004). 數(shù)學(xué)表述: 其中V是子圖,K是頂點(diǎn)的度。即子圖 V 是模塊的條件是模塊內(nèi)頂點(diǎn)的內(nèi)部連邊的度值之和大于模塊內(nèi)頂點(diǎn)的外

14、部連邊的度值之和。 PNAS ---- Proc. Natl. Acad. Sci. USA 美國科學(xué)院院刊,11,模塊劃分的重要性,許多復(fù)雜網(wǎng)絡(luò)共有的性質(zhì)。研究模塊結(jié)構(gòu)有助于研究整個(gè)網(wǎng)絡(luò)的結(jié)構(gòu)和功能,圣塔菲研究所的科學(xué)家合作網(wǎng):模塊代表從事相似領(lǐng)域研究的科學(xué)家集合,數(shù)學(xué)生態(tài)學(xué),統(tǒng)計(jì)物理,12,Martin Rosvall, Carl T. Bergstrom, PNAS, vol. 105, no.4. 1118-1123,

15、 2007,自然科學(xué)論文引用網(wǎng)絡(luò):6128期刊, 約600萬次引用,,劃分為88個(gè)模塊和3024條模塊間的連接,刻畫了學(xué)科之間的聯(lián)系,13,一個(gè)社會(huì)網(wǎng)絡(luò)的例子,1970年美國大學(xué)里的一個(gè)空手道俱樂部關(guān)系網(wǎng)絡(luò):節(jié)點(diǎn)是其34名成員,邊是他們兩年間的友誼關(guān)系,邊數(shù)為78。俱樂部里的矛盾導(dǎo)致其分裂為兩個(gè)小的俱樂部。問題是能否用網(wǎng)絡(luò)的模塊結(jié)構(gòu)來重現(xiàn)這個(gè)過程?它是模塊探測研究中的經(jīng)典例子。,W. W. Zachary, An informati

16、on flow model for conflict and fission in small groups, Journal of Anthropological Research 33, 452-473 1977,Girvan, M, Newman, M., Proc. Natl. Acad. Sci, 2002Ravasz, E, Somera, A, Mongru, D, O

17、ltvai, Z, Barabasi, A., Science, 2002Radicchi, F, Castellano, C, Cecconi, F., Proc. Natl. Acad. Sci, 2004Guimera, R, Mossa, S, Turtschi, A., Proc. Natl. Acad. Sci, 2005Guimera, R, Amaral, L., Nature,

18、 2005Newman, M., Proc. Natl. Acad. Sci, 2006Rosvall, M, Bergstrom, C., Proc. Natl. Acad. Sci,

19、2007Fortunato, S, Barthelemy, M., Proc. Natl. Acad. Sci, 2007Weinan, E, Li, T, Vanden-Eijnden, E., Proc. Natl. Acad. Sci, 2008 Rosvall, M, Bergstrom, C., Proc. Natl. Acad. Sci,

20、 2008 Peter J. Mucha, et al., Science 2010 Yong-Yeol Ahn, James P. Bagrow & Sune Lehmann,Nature, 2010,生物信息學(xué)與最優(yōu)化方法,14,Importance of the to

21、pic,社團(tuán)結(jié)構(gòu)探索方法概述,A large number of methods have been developed for detecting communities, which can be generally categorized into local and global methods. Local methods (局部方法) for community detection identify a subset o

22、f nodes as a community according to certain local connection conditions, independently from the structure of the rest of the network. Such methods include clique overlap-based hierarchical clustering, clique percolati

23、on method, and sub-graph fitness method. Global methods (全局方法)for community detection optimize certain global quantitative functions encoding the quality of the overall partition of the network, such as information t

24、heoretical method, Potts model, and optimization of modularity measures.,15,16,,Shihua Zhang, Rui-Sheng Wang, and Xiang-Sun Zhang. Identification of overlapping community structure in complex networks using fuzzy c-mea

25、ns Clustering. Physica A, 2007, 374, 483–490.,Rui-Sheng Wang, Shihua Zhang, Yong Wang, Xiang-Sun Zhang, Luonan Chen. Clustering complex networks and biological networks by nonnegative matrix factorization with various s

26、imilarity measures. Neurocomputing, 2007,Shihua Zhang, Rui-Sheng Wang and Xiang-Sun Zhang. Uncovering fuzzy community structure in complex networks. Physical Review E, 76, 046103, 2007,我們小組在研究這一問題的早期發(fā)展了一些基于圖論和矩陣譜分解的模塊

27、探測算法 (local method),17,衡量網(wǎng)絡(luò)模塊化的指標(biāo)Q值,設(shè)網(wǎng)絡(luò)為 N=(V,E), Pk = { (V1, E1), …, (Vk, Ek)} 為一個(gè)分劃。L(Vi, Vj) =|Eij|, i in Vi, j in Vj.Newman 和 Girvan (Physical Review E, 2004) 提出一種衡量網(wǎng)絡(luò)社區(qū)結(jié)構(gòu)的指標(biāo) Q 值,18,指標(biāo)Q的問題 (Resolution limit)For

28、tunato and Barthélemy, PNAS, 2007,利用Q 劃分網(wǎng)絡(luò)的計(jì)算步驟: 目前很大一部分模塊探測的方法集中于利用各種啟發(fā)式算法來極大化Q值 ,例如模擬退火、遺傳算法等(Newman, PNAS, 2006; Guimera, Nature, 2005). Resolution limit 現(xiàn)象,19,極端例子:ring of cliques,,Fortunato &

29、Barthelemy, Proc. Natl. Acad. Sci. USA 104 (1), 36-41 (2007),20,提出新的模塊化指標(biāo)D值,模塊化密度函數(shù) D:,Zhenping Li, Shihua Zhang, Rui-Sheng Wang, Xiang-Sun Zhang, Luonan Chen, Quantitative function for community detection. Physical Re

30、view E, 77, 036109, 2008,21,D值克服了Q值存在的 resolution limit 問題,22,結(jié)果,Q值,D值,劃分正確的頂點(diǎn)的比例,錯(cuò)分現(xiàn)象---Misidentification,用Q或D作優(yōu)化可能得到不滿足定義的模塊,Q partitions the network into three communities (two Kn and one K5) when n>=16 (respectiv

31、ely, n>=21), in which K5 is a sub-graph violating all reasonable community definition.,Xiang-Sun Zhang, Rui-Sheng Wang, Yong Wang, Ji-Guang Wang, Yu-Qing Qiu, Lin Wang, and Luonan Chen. Modularity optimization in co

32、mmunity detection of complex networks. Europhysics Letters (EPL), 87, 2009. 被評為 EPL 2009 best paper,23,24,該文的主要貢獻(xiàn)是用離散凸規(guī)劃的概念對兩個(gè)重要問題進(jìn)行解析分析,Q 值和D 值的最優(yōu)化模型都是非線性整數(shù)規(guī)劃目標(biāo)函數(shù)的凸性和凹性無法解析得到對兩個(gè)具有特殊結(jié)構(gòu)的網(wǎng)絡(luò)進(jìn)行分析引入離散凸規(guī)劃(變量是離散的,可以嵌入一個(gè)連

33、續(xù)的凸規(guī)劃)的概念進(jìn)行分析, 得到解析解,所有對modularity進(jìn)行研究的論文(指上面所列的的PNAS,Nature,Sience文章)都是試題論證的,即沒有解析的證明.為了徹底分析resolution limit和 Misidentification 現(xiàn)象,我們對兩類典型網(wǎng)絡(luò)建立了優(yōu)化模型,引入了離散凸分析技術(shù),得到了兩類問題的解析解.,生物信息學(xué)與最優(yōu)化方法,25,,這兩個(gè)例子出現(xiàn)在PNAS中幾乎所有討論網(wǎng)絡(luò)模塊探測的論文里

34、,基于特殊結(jié)構(gòu)的凸分析,ad hoc network,ring of dense lumps,Finding 1 對,生物信息學(xué)與最優(yōu)化方法,28,Finding 2,Finding 3,解析解表明,對這兩個(gè)經(jīng)典的算例,Q和D都有Resolution limit和Misidentification的現(xiàn)象產(chǎn)生,所以Q 和D均只是近似的定量評估函數(shù)。 網(wǎng)絡(luò)社團(tuán)劃分的問題可以用一個(gè)優(yōu)化問題來精確 描述,我們證明

35、了這一模型是NP-hard的。 我們相信用優(yōu)化理論可以徹底解決網(wǎng)絡(luò)社團(tuán)劃分 的問題。網(wǎng)絡(luò)科學(xué)是運(yùn)籌學(xué)的下一個(gè)熱點(diǎn)。,29,直接得到團(tuán)環(huán)網(wǎng)絡(luò)的社團(tuán)結(jié)構(gòu)數(shù)值解,30,直接得到ad hoc網(wǎng)絡(luò)的社團(tuán)結(jié)構(gòu)數(shù)值解,31,,32,33,為了徹底解決這些問題,提出一個(gè)新的 OR 模型和相應(yīng)的算法,這一算法不會(huì)產(chǎn)生resolution limit 和 mis-identification 現(xiàn)象關(guān)鍵思路:模塊分劃質(zhì)量函數(shù)的定義要包含社團(tuán)

36、定義。,Xiang-Sun Zhang, Zhenping Li, Rui-Sheng Wang, Yong Wang. A combinatorial model and algorithm for globally searching community structure in complex networksJournal of Combinatorial Optimization (JCO), 2010.,DOI: 1

37、0.1007/s10878-010-9356-0,A new OR model,Problem definition: Given a network, the community identification problem is to partition the network into as many non-overlapping sub-networks as possible such that each sub-

38、network satisfies a given community definition. 給定一個(gè)網(wǎng)絡(luò)和一個(gè)社團(tuán)的定義,社團(tuán)結(jié)構(gòu)識(shí)別的問題就是將整個(gè)網(wǎng)絡(luò)分成盡可能多的滿足社團(tuán)定義的子網(wǎng)絡(luò)。,34,以上文字定義可以用一個(gè)整數(shù)線性規(guī)劃來描述,我們證明了這個(gè)模型是 NP-hard .,35,A qualified min-cut (QMC) algorithm,A heuristic principle is given t

39、o find a feasible partition with the largest number of communities.It is realized by a min-cut operation: A min-cut operation is called qualified if the two resulting sub-networks satisfy the module definition. T

40、he community identification problem can be solved based on a series of qualified min-cut operations.,36,Experiment results (artificial networks),Rings of cliques,Uneven ad-hoc network,37,Experiment results (real networks

41、),Football team network,Jazz musician network,38,學(xué)術(shù)性的,實(shí)用性的問題遠(yuǎn)遠(yuǎn)沒有解決,Yong-Yeol Ahn, James P. Bagrow & Sune Lehmann,Nature, 2010 Link communities reveal multiscale complexity in networks,39,致謝,This wor

42、k is cooperated with Dr. 李珍萍,Dr. 王瑞省,Dr. 王勇,Dr. 張世華, Dr. 王吉光,Dr. 張俊華This work is supported by 國家自然科學(xué)重點(diǎn)基金10631070 973項(xiàng)目2066CB503905 國家自然科學(xué)基金項(xiàng)目60873205,40,ThanksWelcome to

溫馨提示

  • 1. 本站所有資源如無特殊說明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請下載最新的WinRAR軟件解壓。
  • 2. 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請聯(lián)系上傳者。文件的所有權(quán)益歸上傳用戶所有。
  • 3. 本站RAR壓縮包中若帶圖紙,網(wǎng)頁內(nèi)容里面會(huì)有圖紙預(yù)覽,若沒有圖紙預(yù)覽就沒有圖紙。
  • 4. 未經(jīng)權(quán)益所有人同意不得將文件中的內(nèi)容挪作商業(yè)或盈利用途。
  • 5. 眾賞文庫僅提供信息存儲(chǔ)空間,僅對用戶上傳內(nèi)容的表現(xiàn)方式做保護(hù)處理,對用戶上傳分享的文檔內(nèi)容本身不做任何修改或編輯,并不能對任何下載內(nèi)容負(fù)責(zé)。
  • 6. 下載文件中如有侵權(quán)或不適當(dāng)內(nèi)容,請與我們聯(lián)系,我們立即糾正。
  • 7. 本站不保證下載資源的準(zhǔn)確性、安全性和完整性, 同時(shí)也不承擔(dān)用戶因使用這些下載資源對自己和他人造成任何形式的傷害或損失。

評論

0/150

提交評論