	{"id":3823,"date":"2013-12-03T21:47:11","date_gmt":"2013-12-03T14:47:11","guid":{"rendered":"http:\/\/science-technology.vn\/?p=3823"},"modified":"2013-12-03T21:47:11","modified_gmt":"2013-12-03T14:47:11","slug":"hoc-ve-khoa-hoc-du-lieu","status":"publish","type":"post","link":"https:\/\/science-technology.vn\/?p=3823","title":{"rendered":"H\u1ecdc v\u1ec1 khoa h\u1ecdc d\u1eef li\u1ec7u"},"content":{"rendered":"<p><span style=\"font-size: 14px; line-height: 1.428571429;\">M\u1ed9t sinh vi\u00ean vi\u1ebft cho t\u00f4i: \u201cEm m\u00ea m\u1ea3i v\u1ec1 b\u00e0i b\u00e1o c\u1ee7a th\u1ea7y v\u1ec1 d\u1eef li\u1ec7u l\u1edbn v\u00e0 mu\u1ed1n h\u1ecdc th\u00eam v\u1ec1 n\u00f3. L\u00e0 m\u1ed9t sinh vi\u00ean n\u0103m th\u1ee9 nh\u1ea5t trong qu\u1ea3n l\u00ed h\u1ec7 th\u00f4ng tin, em kh\u00f4ng bi\u1ebft \u0111\u00e2y c\u00f3 ph\u1ea3i l\u00e0 l\u0129nh v\u1ef1c \u0111\u00fang \u0111\u1ec3 h\u1ecdc hay em ph\u1ea3i chuy\u1ec3n sang khoa h\u1ecdc m\u00e1y t\u00ednh? Em ph\u1ea3i h\u1ecdc ng\u00f4n ng\u1eef l\u1eadp tr\u00ecnh n\u00e0o trong khu v\u1ef1c n\u00e0y? Xin th\u1ea7y l\u1eddi khuy\u00ean.\u201d<\/span><\/p>\n<p>&nbsp;<\/p>\n<p>\u0110\u00e1p: N\u1ebfu b\u1ea1n h\u1ecdc Qu\u1ea3n l\u00ed h\u1ec7 th\u00f4ng tin (ISM) b\u1ea1n \u1edf \u0111\u00fang l\u0129nh v\u1ef1c \u0111\u1ec3 theo \u0111u\u1ed5i ngh\u1ec1 trong Big Data r\u1ed3i. Trong khi c\u00f3 nhi\u1ec1u ng\u00f4n ng\u1eef l\u1eadp tr\u00ecnh \u0111\u01b0\u1ee3c d\u1ea1y trong \u0111\u1ea1i h\u1ecdc nh\u01b0 C++, Java, hay Python nh\u01b0ng nhi\u1ec1u \u1ee9ng d\u1ee5ng Big Data \u0111ang d\u00f9ng ng\u00f4n ng\u1eef kh\u00e1c c\u00f3 t\u00ean l\u00e0 \u201cR\u201d v\u00ec c\u00e1c c\u00f4ng c\u1ee5 Big Data nh\u01b0 Pig\/Hive v\u00e0 Hadoop ph\u1ea7n l\u1edbn \u0111\u01b0\u1ee3c vi\u1ebft trong ng\u00f4n ng\u1eef \u201cR\u201d. 
Ng\u00f4n ng\u1eef n\u00e0y \u0111\u01b0\u1ee3c thi\u1ebft k\u1ebf \u0111\u1eb7c bi\u1ec7t cho t\u00ednh to\u00e1n th\u1ed1ng k\u00ea v\u00e0 \u0111\u1ed3 ho\u1ea1 v\u00e0 \u0111\u01b0\u1ee3c d\u00f9ng r\u1ed9ng r\u00e3i trong c\u00e1c nh\u00e0 th\u1ed1ng k\u00ea Big data v\u00e0 c\u00e1c chuy\u00ean vi\u00ean khai ph\u00e1 d\u1eef li\u1ec7u \u0111\u1ec3 ph\u00e1t tri\u1ec3n ph\u1ea7n m\u1ec1m th\u1ed1ng k\u00ea v\u00e0 ph\u00e2n t\u00edch d\u1eef li\u1ec7u. \u201cR\u201d c\u0169ng h\u1ed9i t\u1ee5 v\u00e0o k\u0129 thu\u1eadt \u0111\u1ed3 ho\u1ea1 nh\u01b0 m\u00f4 h\u00ecnh tuy\u1ebfn t\u00ednh v\u00e0 phi tuy\u1ebfn, ph\u00e2n t\u00edch chu\u1ed7i th\u1eddi gian, v\u00e0 ph\u00e2n lo\u1ea1i.<\/p>\n<p>Tuy nhi\u00ean n\u1ebfu b\u1ea1n \u0111\u00e3 bi\u1ebft m\u1ed9t ng\u00f4n ng\u1eef t\u1ed1t, d\u1ec5 h\u1ecdc ng\u00f4n ng\u1eef kh\u00e1c. N\u1ebfu b\u1ea1n c\u00f3 th\u1ec3 l\u1eadp tr\u00ecnh trong Java hay C++, b\u1ea1n c\u00f3 th\u1ec3 h\u1ecdc R hay Python trong v\u00e0i tu\u1ea7n cho n\u00ean t\u00f4i s\u1ebd kh\u00f4ng qu\u00e1 b\u1eadn t\u00e2m v\u1edbi ng\u00f4n ng\u1eef l\u1eadp tr\u00ecnh. \u0110i\u1ec1u b\u1ea1n c\u1ea7n l\u00e0 hi\u1ec3u c\u00e1ch d\u1eef li\u1ec7u \u0111\u01b0\u1ee3c thu th\u1eadp, t\u1ed5 ch\u1ee9c v\u00e0 ph\u00e2n t\u00edch \u0111\u1ec3 tr\u00edch r\u00fat th\u00f4ng tin quan tr\u1ecdng.<\/p>\n<p>\u0110\u1ec3 b\u1eaft \u0111\u1ea7u, b\u1ea1n n\u00ean b\u1eaft \u0111\u1ea7u v\u1edbi Excel, ch\u01b0\u01a1ng tr\u00ecnh trang t\u00ednh l\u01b0u gi\u1eef, t\u1ed5 ch\u1ee9c v\u00e0 thao t\u00e1c d\u1eef li\u1ec7u. B\u1ea1n s\u1ebd h\u1ecdc \u0111\u01b0a d\u1eef li\u1ec7u v\u00e0o \u00f4 trang t\u00ednh, d\u00f9ng c\u00e1c c\u00f4ng th\u1ee9c to\u00e1n h\u1ecdc \u0111\u1ec3 t\u00ednh to\u00e1n hay thao t\u00e1c c\u00e1c d\u1eef li\u1ec7u n\u00e0y. 
\u0110\u00e2y l\u00e0 k\u0129 n\u0103ng c\u01a1 b\u1ea3n nh\u1ea5t m\u00e0 b\u1ea1n ph\u1ea3i bi\u1ebft c\u00e1ch th\u1ef1c hi\u1ec7n c\u00e1c ph\u00e9p to\u00e1n c\u01a1 b\u1ea3n \u0111\u1ec3 t\u00ecm c\u00e1c gi\u00e1 tr\u1ecb nh\u01b0 l\u1ee3i nhu\u1eadn v\u00e0 t\u1ed5n th\u1ea5t trong \u1ee9ng d\u1ee5ng t\u00e0i ch\u00ednh v.v. B\u1ea1n s\u1ebd h\u1ecdc c\u00e1ch t\u1ea1o ra hi\u1ec3n th\u1ecb \u0111\u1ed3 ho\u1ea1 t\u1eeb nh\u1eefng d\u1eef li\u1ec7u n\u00e0y hay t\u1ea1o ra b\u00e1o c\u00e1o.<\/p>\n<p>Sau khi b\u1ea1n \u0111\u00e3 l\u00e0m ch\u1ee7 kh\u00e1i ni\u1ec7m v\u1ec1 trang t\u00ednh Excel, b\u1ea1n c\u00f3 th\u1ec3 chuy\u1ec3n sang c\u01a1 s\u1edf d\u1eef li\u1ec7u \u0111\u1ec3 h\u1ecdc th\u00eam v\u1ec1 quan h\u1ec7 gi\u1eefa c\u00e1c d\u1eef li\u1ec7u n\u00e0y. B\u1ea1n s\u1ebd h\u1ecdc c\u00e1ch thu th\u1eadp d\u1eef li\u1ec7u, t\u1ed5 ch\u1ee9c ch\u00fang th\u00e0nh c\u00e1c t\u1ec7p v\u00e0 b\u1ea3n ghi, v\u00e0 l\u01b0u ch\u00fang \u0111\u1ec3 truy nh\u1eadp nhanh. B\u1ea1n c\u0169ng h\u1ecdc c\u00e1ch x\u00e2y d\u1ef1ng b\u1ea3ng v\u00e0 ch\u1ec9 s\u1ed1 \u0111\u1ec3 thao t\u00e1c c\u00e1c d\u1eef li\u1ec7u n\u00e0y v\u00e0 ph\u00e2n t\u00edch ch\u00fang \u0111\u1ec3 t\u1ea1o ra b\u00e1o c\u00e1o \u0111\u1eb7c bi\u1ec7t. B\u1ea1n h\u1ecdc d\u00f9ng H\u1ec7 qu\u1ea3n tr\u1ecb c\u01a1 s\u1edf d\u1eef li\u1ec7u (DBMS), ch\u01b0\u01a1ng tr\u00ecnh ph\u1ea7n m\u1ec1m gi\u00fap b\u1ea1n t\u1ea1o ra v\u00e0 qu\u1ea3n l\u00ed c\u00e1c c\u01a1 s\u1edf d\u1eef li\u1ec7u. B\u1ea1n c\u00f3 th\u1ec3 d\u00f9ng SQL server c\u1ee7a Microsoft, Oracle, hay MySQL \u0111\u1ec3 l\u00e0m ch\u1ee7 k\u0129 n\u0103ng c\u1ee7a b\u1ea1n trong qu\u1ea3n tr\u1ecb c\u01a1 s\u1edf d\u1eef li\u1ec7u. 
(C\u01a1 s\u1edf d\u1eef li\u1ec7u \u0111\u01b0\u1ee3c d\u1ea1y trong n\u0103m th\u1ee9 ba trong ch\u01b0\u01a1ng tr\u00ecnh ISM.)<\/p>\n<p>Sau khi b\u1ea1n h\u1ecdc v\u1ec1 c\u01a1 s\u1edf d\u1eef li\u1ec7u b\u1ea1n c\u00f3 th\u1ec3 chuy\u1ec3n sang h\u1ecdc v\u1ec1 Trinh s\u00e1t doanh nghi\u1ec7p (BI), \u1ee9ng d\u1ee5ng ph\u1ea7n m\u1ec1m \u0111\u01b0\u1ee3c d\u00f9ng \u0111\u1ec3 ph\u00e2n t\u00edch d\u1eef li\u1ec7u \u0111\u1ec3 nh\u1eadn di\u1ec7n th\u00f4ng tin c\u00f3 gi\u00e1 tr\u1ecb. BI \u0111\u01b0\u1ee3c t\u1ea1o n\u00ean t\u1eeb v\u00e0i ho\u1ea1t \u0111\u1ed9ng c\u00f3 li\u00ean quan nh\u01b0 khai ph\u00e1 d\u1eef li\u1ec7u, x\u1eed l\u00ed ph\u00e2n t\u00edch tr\u1ef1c tuy\u1ebfn, truy v\u1ea5n v\u00e0 l\u00e0m b\u00e1o c\u00e1o. Ng\u00e0y nay ph\u1ea7n l\u1edbn c\u00e1c c\u00f4ng ti \u0111\u1ec1u d\u00f9ng BI \u0111\u1ec3 c\u1ea3i ti\u1ebfn vi\u1ec7c ra quy\u1ebft \u0111\u1ecbnh qu\u1ea3n l\u00ed, gi\u1ea3m l\u00e3ng ph\u00ed hay nh\u1eadn di\u1ec7n c\u01a1 h\u1ed9i kinh doanh m\u1edbi. (BI th\u01b0\u1eddng \u0111\u01b0\u1ee3c d\u1ea1y \u1edf n\u0103m th\u1ee9 ba hay th\u1ee9 t\u01b0 trong ch\u01b0\u01a1ng tr\u00ecnh ISM.)<\/p>\n<p>Ph\u1ea7n l\u1edbn c\u00e1c ch\u01b0\u01a1ng tr\u00ecnh b\u1eb1ng c\u1eed nh\u00e2n trong Qu\u1ea3n l\u00ed h\u1ec7 th\u00f4ng tin (ISM) bao qu\u00e1t qu\u1ea3n l\u00ed c\u01a1 s\u1edf d\u1eef li\u1ec7u v\u00e0 trinh s\u00e1t doanh nghi\u1ec7p nh\u01b0ng \u0111\u1ec3 h\u1ecdc th\u00eam v\u1ec1 Big data, b\u1ea1n s\u1ebd c\u1ea7n ti\u1ebfp t\u1ee5c h\u1ecdc b\u1eb1ng th\u1ea1c s\u0129 trong khoa h\u1ecdc d\u1eef li\u1ec7u, ph\u00e2n t\u00edch d\u1eef li\u1ec7u qui m\u00f4 l\u1edbn hay khoa h\u1ecdc t\u00ednh to\u00e1n v.v. Trong nh\u1eefng ch\u01b0\u01a1ng tr\u00ecnh n\u00e0y b\u1ea1n s\u1ebd h\u1ecdc nhi\u1ec1u h\u01a1n v\u1ec1 khu\u00f4n kh\u1ed5 Big Data nh\u01b0 c\u00e1c c\u00f4ng c\u1ee5 Hadoop; c\u00e1c c\u00f4ng c\u1ee5 nh\u01b0 Pig v\u00e0 Hive, NoSQL, MapReduce v.v. 
T\u00f4i \u0111\u00e3 vi\u1ebft nhi\u1ec1u b\u00e0i b\u00e1o v\u1ec1 Big Data trong blog n\u00e0y m\u00e0 b\u1ea1n c\u00f3 th\u1ec3 \u0111\u1ecdc th\u00eam \u0111\u1ec3 bi\u1ebft th\u00eam v\u1ec1 Big Data.<\/p>\n<p>&nbsp;<\/p>\n<p>&#8212;English version&#8212;<\/p>\n<p>&nbsp;<\/p>\n<p>Learning about data science<\/p>\n<p>A student wrote to me: \u201cI am fascinated by your articles on big data and want to learn more about it. As a first-year student in Information System Management, I do not know whether this is the right field to study or whether I should switch to Computer Science. Which programming language should I learn so I can work in this area? Please advise.\u201d<\/p>\n<p>&nbsp;<\/p>\n<p>Answer: If you study Information System Management (ISM), you are in the right field to pursue a career in Big Data. Many programming languages are taught in college, such as C++, Java, and Python, but many Big Data applications use another language called \u201cR\u201d. Big Data tools such as Hadoop, Pig, and Hive are themselves written mostly in Java, but R is often used alongside them for analysis. R is designed especially for statistical computing and graphics and is widely used among Big Data statisticians and data mining specialists for developing statistical software and analyzing data. R also provides statistical techniques such as linear and nonlinear modeling, time-series analysis, and classification.<\/p>\n<p>However, if you already know one programming language well, it is easy to learn another. If you can program in Java or C++, you can learn R or Python in a matter of a few weeks, so I would not be too concerned with the programming language. What you need is to understand how data are collected, organized, and analyzed to extract the important information.<\/p>\n<p>To start, you should begin with Excel, a spreadsheet program that stores, organizes, and manipulates data. 
You will learn to put data into spreadsheet cells and use mathematical formulas to calculate or manipulate these data. This is the most basic skill: you must know how to perform basic operations to find values such as profit or loss in finance applications. You will also learn how to create charts and reports from these data.<\/p>\n<p>After you have mastered Excel spreadsheets, you can move on to databases to learn more about the relationships between data. You will learn how to collect data, organize them into files and records, and store them for quick access. You will also learn how to build tables and indexes to manipulate these data and analyze them to create special reports. You will learn to use a Database Management System (DBMS), a software program that helps you create and manage databases. You may use Microsoft\u2019s SQL Server, Oracle, or MySQL to master your database management skills. (Databases are often taught in the third year of the ISM program.)<\/p>\n<p>After you learn about databases, you may move on to Business Intelligence (BI), the software applications used to analyze data to identify valuable information. BI is made up of several related activities such as data mining, online analytical processing, querying, and reporting. Today most companies use BI to improve management decision making, reduce waste, or identify new business opportunities. (BI is often taught in the third or fourth year of the ISM program.)<\/p>\n<p>Most bachelor\u2019s degree programs in Information System Management (ISM) cover database management and business intelligence, but to learn more about Big Data, you will need to continue to a master\u2019s degree in data science, large-scale data analysis, computational science, or a similar field. In these programs you will learn more about Big Data frameworks such as Hadoop and MapReduce, tools such as Pig and Hive, and NoSQL databases. 
I have written several articles on Big Data in this blog that you can read further to learn more about Big Data.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>M\u1ed9t sinh vi\u00ean vi\u1ebft cho t\u00f4i: \u201cEm m\u00ea m\u1ea3i v\u1ec1 b\u00e0i b\u00e1o c\u1ee7a th\u1ea7y v\u1ec1 d\u1eef li\u1ec7u l\u1edbn v\u00e0 mu\u1ed1n h\u1ecdc th\u00eam v\u1ec1 n\u00f3. L\u00e0 &hellip; <\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3],"tags":[],"class_list":["post-3823","post","type-post","status-publish","format-standard","hentry","category-quan-li-he-thong-tin"],"_links":{"self":[{"href":"https:\/\/science-technology.vn\/index.php?rest_route=\/wp\/v2\/posts\/3823","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/science-technology.vn\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/science-technology.vn\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/science-technology.vn\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/science-technology.vn\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=3823"}],"version-history":[{"count":2,"href":"https:\/\/science-technology.vn\/index.php?rest_route=\/wp\/v2\/posts\/3823\/revisions"}],"predecessor-version":[{"id":3825,"href":"https:\/\/science-technology.vn\/index.php?rest_route=\/wp\/v2\/posts\/3823\/revisions\/3825"}],"wp:attachment":[{"href":"https:\/\/science-technology.vn\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=3823"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/science-technology.vn\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=3823"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/science-technology.vn\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=3823"}],"curies":[{"name":"wp","href":"https:\/\/api.w.
org\/{rel}","templated":true}]}}