4,300 Chinese characters; 2,400 English words; 13,000 English characters
Source: Ishibashi K, Iwasaki T, Otomasa S, et al. Model selection for financial statement analysis: Variable selection with data mining technique [J]. Procedia Computer Science, 2016, 96(C):1681-1690.

English text:

Model selection for financial statement analysis: Variable selection with data mining technique

Ken Ishibashi, Takuya Iwasaki, Shota Otomasa and Katsutoshi Yada

Abstract

The purpose of this study is to verify the effectiveness of a data-driven approach to financial statement analysis. In accounting, variable selection for constructing models that predict a firm's earnings from financial statement data has been addressed from the perspective of corporate valuation theory and related approaches, but it has not been sufficiently verified with data mining techniques. This paper attempts to verify the applicability of variable selection to the construction of an earnings prediction model using recent data mining techniques. The results of the analysis suggest that a method that considers the interaction among variables and the redundancy of the model could be effective for financial statement data.

Keywords: Financial statement analysis; earnings prediction model; model selection; variable selection; data mining

1. Introduction

Recent advances in information and communication technology are dramatically improving computational speeds. Under these circumstances, researchers have undertaken studies focused on big data accumulated in various areas. Data mining techniques play an important role in data-driven analysis and modeling. Various data mining methods have been developed to date, and software such as SPSS and Weka makes them easy to use. However, in these applications we generally need to select a method appropriate to the data.

The purpose of this study is to verify the effectiveness of a data-driven approach to financial statement analysis. In accounting, Ou and Penman (1989)1) addressed the construction of an earnings prediction model focused on financial statement data. They constructed a model that predicts the probability of an increase in a firm's earnings in the subsequent fiscal year by using stepwise logistic regression analysis. By introducing variable selection, their prediction model used variable interactions that have not been proved theoretically. That is, it is possible that they constructed an earnings prediction model using unusual information that other people do not have.

The result of Ou and Penman (1989)1) has various problems related to the practical use of their method.
In that research1), they did not state why they applied logistic regression analysis to the model construction. Furthermore, follow-up studies2), 3) pointed out various problems through additional verification of the model of Ou and Penman (1989)1). For example, Holthausen and Larcker (1992)2) applied the strategy of Ou and Penman (1989)1) to another fiscal period, but could not obtain anomalies of the probability of

Relief is an instance-based attribute ranking scheme proposed by Kira and Rendell (1992)6) and later improved by Kononenko (1994)10). This method is used to estimate a variable's importance for classification.
For a given class, Relief decides a variable's importance by focusing on instances located around the border of the class. From these instances, two are selected as the near-miss and the near-hit. The near-miss is the instance that is closest to a randomly selected sample but does not belong to the same class. The near-hit, on the other hand, is the instance that is closest to the sample and belongs to the same class. In Relief, the importance of a variable is decided based on its effectiveness for the classification of the near-miss. Existing research5) showed that this method had a large tolerance to noise but low redundancy.

In the application of Relief to variable selection, the variables to adopt are generally decided by setting a threshold on their estimated ranks. In this study, the importance of variables is decided by 10-fold cross-validation, and variables for which the "Merit" criterion for the classification is more than 0 are adopted.

2.3. Correlation-based feature selection

CFS is a method that evaluates subsets of variables, not individual variables7). This method searches for subsets containing variables that are highly correlated with the class and have low inter-correlation with each other. CFS tends to be computationally cheap and to choose small variable subsets, but it has difficulty finding solutions if there are strong variable interactions5).

In this study, we use a greedy algorithm to search for the subset that has the best CFS evaluation.

2.4. Consistency-based subset evaluation

CNS evaluates variable subsets by using class consistency8). This method searches for combinations of variables that divide the data into subsets containing a strong single-class majority. Thus, this search tends to be biased in favor of small variable subsets with high class consistency. Compared with CFS, CNS is useful if there are strong variable interactions, but the size of the subset tends to be large5).

In this study, CNS searches for subsets by using a greedy algorithm, as in CFS.

2.5. C4.5 decision tree learner

C4.5 is a learning algorithm that constructs a decision tree by selecting variables that maximize the mutual information for classification9). This method can avoid over-training to the data by a function called "branch pruning", which removes branches that have little mutual information or classify few instances. In variable selection, the variables contained in the decision tree are adopted as a subset of variables.

In this study, a decision tree is constructed by using all training data for modeling, and then branches that classify fewer than 50 instances are removed by pruning. In this way, we obtain a subset with a size equivalent to that of the CFS subsets.

2.6. Stepwise method

In existing research, Ou and Penman (1989)1) constructed an earnings prediction model by using stepwise logistic regression. The stepwise method is a conventional method that sequentially chooses variables to improve an evaluation criterion. In this method, the process of variable selection is very clear. However, because the
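
To make the near-hit and near-miss idea behind Relief more concrete, the following is a minimal sketch of a two-class Relief weighting in plain Python. It is not the procedure used in the paper: the function name `relief_weights`, the Manhattan distance on range-scaled features, the unsquared difference in the weight update, and the choice to use every instance once as the probe (instead of random sampling) are all assumptions made for illustration. The "Merit" value obtained by 10-fold cross-validation in the study is not reproduced here.

```python
from typing import List, Sequence


def relief_weights(X: Sequence[Sequence[float]], y: Sequence[int]) -> List[float]:
    """Illustrative two-class Relief weights (hypothetical helper, not the paper's code).

    Every instance is used once as the probe, which keeps the result deterministic.
    """
    n, d = len(X), len(X[0])

    # Range of each feature, used to scale per-feature differences into [0, 1].
    spans = []
    for j in range(d):
        col = [row[j] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(j: int, a: Sequence[float], b: Sequence[float]) -> float:
        return abs(a[j] - b[j]) / spans[j]

    def dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum(diff(j, a, b) for j in range(d))

    w = [0.0] * d
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue  # no neighbour of one kind, so this probe gives no evidence
        near_hit = min(hits, key=lambda k: dist(X[i], X[k]))
        near_miss = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            # Reward features that separate the probe from its near-miss,
            # penalise features that separate it from its near-hit.
            w[j] += (diff(j, X[i], X[near_miss]) - diff(j, X[i], X[near_hit])) / n
    return w


# Toy data (made up): feature 0 tracks the class, feature 1 is noise.
X_demo = [[0.0, 0.1], [0.1, 0.9], [0.2, 0.4], [0.8, 0.2], [0.9, 0.8], [1.0, 0.5]]
y_demo = [0, 0, 0, 1, 1, 1]
weights = relief_weights(X_demo, y_demo)
selected = [j for j, wj in enumerate(weights) if wj > 0]  # keep positive weights only
```

On this toy data the class-tracking feature receives a positive weight and the noise feature a negative one, so a "greater than 0" threshold of the kind described for Relief keeps only the first feature.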
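
The CFS evaluation and the greedy search described for it can be sketched in a similar way. The code below assumes the commonly cited CFS merit, k·r_cf / sqrt(k + k(k−1)·r_ff), where r_cf is the mean variable-class correlation and r_ff the mean variable-variable correlation of a k-variable subset. Using the absolute Pearson correlation as the correlation measure, the stopping rule (stop when no candidate improves the merit), and the names `pearson`, `cfs_merit` and `greedy_cfs` are assumptions for illustration; the paper does not state these details.

```python
from itertools import combinations
from math import sqrt
from typing import List, Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either input has zero variance."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (v - mb) for x, v in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / sqrt(va * vb) if va > 0 and vb > 0 else 0.0


def cfs_merit(cols: Sequence[Sequence[float]], y: Sequence[float],
              subset: Sequence[int]) -> float:
    """CFS-style merit of a subset of column indices (illustrative)."""
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = sum(abs(pearson(cols[j], y)) for j in subset) / k
    pairs = list(combinations(subset, 2))
    r_ff = (sum(abs(pearson(cols[a], cols[b])) for a, b in pairs) / len(pairs)
            if pairs else 0.0)
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


def greedy_cfs(cols: Sequence[Sequence[float]], y: Sequence[float]) -> List[int]:
    """Forward greedy search: add the best variable until the merit stops improving."""
    selected: List[int] = []
    best = 0.0
    remaining = set(range(len(cols)))
    while remaining:
        cand, merit = max(
            ((j, cfs_merit(cols, y, selected + [j])) for j in sorted(remaining)),
            key=lambda t: t[1],
        )
        if merit <= best:
            break
        selected.append(cand)
        remaining.remove(cand)
        best = merit
    return selected


# Toy columns (made up): column 1 is nearly a copy of column 0, column 2 is noise.
y_demo = [0, 0, 0, 1, 1, 1]
cols_demo = [
    [0.1, 0.2, 0.3, 0.7, 0.8, 0.9],
    [0.15, 0.2, 0.35, 0.65, 0.8, 0.85],
    [0.5, 0.1, 0.9, 0.4, 0.8, 0.2],
]
chosen = greedy_cfs(cols_demo, y_demo)
```

On the toy columns the search stops after the first variable: the near-duplicate column adds little class correlation while raising the inter-correlation term, which is the redundancy penalty that leads CFS toward small subsets.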