6.1 RMySQL数据库编程指南

问题

R语言如何访问MySQL?

引言

MySQL是一款最常用的开源数据库软件,安装简单(关于MySQL的安装和配置,请参考附录B),运行稳定,非常适用于中小型的数据存储。R作为数据分析的工具,当然要支持数据库驱动接口。让R和MySQL配合在一起,所能爆发出的能量是巨大的。

RMySQL是一个R语言程序包,提供了访问MySQL数据库的R语言接口程序,RMySQL的依赖于DBI包。RMySQL不仅提供了基本的数据库访问,SQL查询,还封装了一些方法,比如读整表、分页、data.frame快速插入等的功能。掌握好RMySQL,数据库编辑将得心应手!

由于操作系统的原因,Windows和Linux有不一样的字符集,不一样的运行时环境。所以,本节将分别介绍如何在Linux和Windows上面安装和使用RMySQL。

6.1.1 RMySQL在Linux下安装

Linux系统环境:

  • Linux: Ubuntu 12.04.2 LTS 64bit server
  • Linux字符集: en_US.UTF-8
  • R: 3.0.1, x86_64-pc-linux-gnu (64-bit)
  • MySQL: Ver 14.14 Distrib 5.5.29 64bit server
  • MySQL字符集: utf8

1. 查看Linux下的系统环境

~ uname -a  # Linux内核
Linux conan 3.5.0-23-generic #35~precise1-Ubuntu SMP Fri Jan 25 17:13:26 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
~ cat /etc/issue  # Linux版本
Ubuntu 12.04.2 LTS \n \l
~ locale   # Linux系统字符集设置LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=en_US.UTF-8
  ~ mysql --version   # 查看MySQL数据库服务器版本
mysql  Ver 14.14 Distrib 5.5.29, for debian-linux-gnu (x86_64) using readline 6.2
mysql> show variables like '%char%';     # 查看MySQL字符集
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)

2. 在R环境中安装RMySQL

~ R   # 启动R程序

> install.packages('RMySQL')   # 安装RMySQL

# 省略部分输出

Configuration error:
  could not find the MySQL installation include and/or library
  directories.  Manually specify the location of the MySQL
  libraries and the header files and re-run R CMD INSTALL.

INSTRUCTIONS:

1. Define and export the 2 shell variables PKG_CPPFLAGS and
   PKG_LIBS to include the directory for header files (*.h)
   and libraries, for example (using Bourne shell syntax):
      export PKG_CPPFLAGS="-I"
      export PKG_LIBS="-L -lmysqlclient"
   Re-run the R INSTALL command:
      R CMD INSTALL RMySQL_.tar.gz

2. Alternatively, you may pass the configure arguments
      --with-mysql-dir= (distribution directory)
   or
      --with-mysql-inc= (where MySQL header files reside)
      --with-mysql-lib= (where MySQL libraries reside)
   in the call to R INSTALL --configure-args='...'

 R CMD INSTALL --configure-args='--with-mysql-dir=DIR' RMySQL_.tar.gz
ERROR: configuration failed for package ‘RMySQL’
* removing ‘/home/conan/R/x86_64-pc-linux-gnu-library/3.0/RMySQL’

The downloaded source packages are in
        ‘/tmp/Rtmpu0Gn88/downloaded_packages’
Warning message:
In install.packages("RMySQL") :
  installation of package ‘RMySQL’ had non-zero exit status

安装出错了,提示我们需要增加MySQL安装目录的配置参数 --with-mysql-dir。下面我们解决安装出错的问题。

~ sudo apt-get install libdbd-mysql libmysqlclient-dev    # 安装mysql类库
~ whereis mysql    # 找到mysql的安装目录
mysql: /usr/bin/mysql /etc/mysql /usr/lib/mysql /usr/bin/X11/mysql /usr/share/mysql /usr/share/man/man1/mysql.1.gz

~ ls /tmp/Rtmpu0Gn88/downloaded_packages    # 找到刚刚下载的文件,RMySQL_.tar.gz
DBI_0.2-7.tar.gz  RMySQL_0.9-3.tar.gz

~ R CMD INSTALL --configure-args='--with-mysql-dir=/usr/lib/mysql' /tmp/Rtmpu0Gn88/downloaded_packages/RMySQL_0.9-3.tar.gz  # 通过命令重新安装RMySQL

# 省略部分输出

installing to /home/conan/R/x86_64-pc-linux-gnu-library/3.0/RMySQL/libs
** R
** inst
** preparing package for lazy loading
Creating a generic function for ‘format’ from package ‘base’ in package ‘RMySQL’
Creating a generic function for ‘print’ from package ‘base’ in package ‘RMySQL’
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded
* DONE (RMySQL)

这次RMySQL安装成功。

3. 在MySQL中建库建表

~ mysql -uroot –p    # 通过mysql命令行客户端工具,登陆本地MySQL
mysql> create database rmysql;    # 创建一个数据库叫rmysql
Query OK, 1 row affected (0.00 sec)

mysql> grant all on rmysql.* to rmysql@localhost identified by 'rmysql';  # 授权rmysql用户,可以本地操作rmysql库
Query OK, 0 rows affected (0.00 sec)

mysql> use rmysql  # 切换到rmysql库
Database changed

mysql> CREATE TABLE t_user(   # 创建一个表 t_user
    -> id INT PRIMARY KEY AUTO_INCREMENT,
    -> user varchar(12) NOT NULL UNIQUE
    -> )ENGINE=INNODB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.07 sec)

mysql> INSERT INTO t_user(user) values('A1'),('AB'),('fens.me');  # 插入3条数据
Query OK, 3 rows affected (0.04 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM t_user;      # 查询数据
+----+---------+
| id | user    |
+----+---------+
|  1 | A1      |
|  2 | AB      |
|  3 | fens.me |
+----+---------+
3 rows in set (0.00 sec)

4. 通过R程序,读MySQL数据库数据

~ R    # 启动R程序
> library(RMySQL)    # 加载RMySQL
Loading required package: DBI
> conn <- dbConnect(MySQL(), dbname = "rmysql", username="rmysql", password="rmysql")  # 建立连接

> users = dbGetQuery(conn, "SELECT * FROM t_user")    # 执行SQL
> users    # 查看数据
  id    user
1  1      A1
2  2      AB
3  3 fens.me

> dbDisconnect(conn)    # 断开连接
[1] TRUE

好了,我们已经实现了在Linux Ubuntu下面R和MySQL的连接。

6.1.3 RMySQL在Win7下安装

Win系统环境:

  • Win7: 64位旗舰版
  • Win字符集: gbk,utf8
  • R: 3.0.1, x86_64-w64-mingw32/x64 (64-bit)
  • MySQL: mysql Ver 14.14 Distrib 5.6.11, for Win64 (x86_64)

1. 查看Win7下的系统环境

~ R --version   # 查看R版本
R version 3.0.1 (2013-05-16) -- "Good Sport"
Copyright (C) 2013 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

~ mysql --version   # MySQL版本
mysql  Ver 14.14 Distrib 5.6.11, for Win64 (x86_64)
mysql> show variables like '%char%';    # MySQL字符集
+--------------------------+------------------------------------+
| Variable_name            | Value                              |
+--------------------------+------------------------------------+
| character_set_client     | gbk                                |
| character_set_connection | gbk                                |
| character_set_database   | utf8                               |
| character_set_filesystem | binary                             |
| character_set_results    | gbk                                |
| character_set_server     | utf8                               |
| character_set_system     | utf8                               |
| character_sets_dir       | D:\toolkit\mysql56\share\charsets\ |
+--------------------------+---
8 rows in set (0.07 sec)

2. 在R环境中安装RMySQL

~ R    # 启动R程序
> install.packages('RMySQl')    # 安装RMySQL
package 'RMySQl' is not available (for R version 3.0.1)

又遇到错误,提示没有对应的RMySQL安装版本。

由于RMySQL没有提供Windows版本的发行包,所以我们必须在Windows中手动编译RMySQL的包后,再进行安装。

> install.packages("RMySQL", type="source")    # 下载RMySQL源代码包
URLhttp://cran.dataguru.cn/src/contrib/RMySQL_0.9-3.tar.gz'
Content type 'application/x-gzip' length 165363 bytes (161 Kb)
URL
downloaded 161 Kb

找到源代码包:RMySQL_0.9-3.tar.gz

~ dir C:\Users\Administrator\AppData\Local\Temp\RtmpsfqQjK\downloaded_packages
2013-09-24  13:16           165,363 RMySQL_0.9-3.tar.gz

通过源代码包安装

~ D:\workspace\R\mysql>R CMD INSTALL C:\Users\Administrator\AppData\Local\Temp\RtmpsfqQjK\downloaded_packages\RMySQL_0.9-3.tar.gz
* installing to library 'C:/Program Files/R/R-3.0.1/library'
* installing *source* package 'RMySQL' ...
** 'RMySQL'MD5
checking for $MYSQL_HOME... not found... searching registry...

# 省略部分输出

readRegistry("SOFTWARE\\MySQL AB", hive = "HLM", maxdepth = 2) :
  Registry key 'SOFTWARE\MySQL AB' not found


ERROR: configuration failed for package 'RMySQL'
* removing 'C:/Program Files/R/R-3.0.1/library/RMySQL'

又出错了,我们还需要设置MYSQL_HOME的环境变量。

~ set MYSQL_HOME=D:\toolkit\mysql56

注: MYSQL_HOME建议设置在系统环境变量中。

再一次安装RMySQL

D:\workspace\R\mysql>R CMD INSTALL C:\Users\Administrator\AppData\Local\Temp\RtmpsfqQjK\downloaded_packages\RMySQL_0.9-3.tar.gz
# 省略部分输出
gcc.exe: error: D:\toolkit\mysql56/bin/libmySQL.dll: No such file or directory
ERROR: compilation failed for package 'RMySQL'
* removing 'C:/Program Files/R/R-3.0.1/library/RMySQL'

错误不断,这次提示为没有找到动态链接库文件 D:\toolkit\mysql56/bin/libmySQL.dll 。

# 复制动态链接库libmySQL.dll
cp D:\toolkit\mysql56\lib\libmysql.dll D:\toolkit\mysql56\bin\
mv D:\toolkit\mysql56\bin\libmysql.dll D:\toolkit\mysql56\bin\libmySQL.dll

再一次安装RMySQL

~ D:\workspace\R\mysql>R CMD INSTALL C:\Users\Administrator\AppData\Local\Temp\RtmpsfqQjK\downloaded_packages\RMySQL_0.9-3.tar.gz
# 省略部分输出
installing to C:/Program Files/R/R-3.0.1/library/RMySQL/libs/x64
** R
** inst
** preparing package for lazy loading
Creating a generic function for 'format' from package 'base' in package 'RMySQL'
Creating a generic function for 'print' from package 'base' in package 'RMySQL'
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded
MYSQL_HOME defined as D:\toolkit\mysql56
* DONE (RMySQL)

终于安装成功了,值得庆祝!

3. 在MySQL中建库建表

~ mysql -uroot –p    # 通过mysql命令行客户端工具,登陆本地MySQL
mysql> create database rmysql;    # 创建一个数据库叫rmysql
Query OK, 1 row affected (0.00 sec)

~ mysql -uroot –p    # 通过mysql命令行客户端工具,登陆本地MySQL
mysql> create database rmysql;    # 创建一个数据库叫rmysql
Query OK, 1 row affected (0.00 sec)


mysql> grant all on rmysql.* to rmysql@localhost identified by 'rmysql';  # 授权rmysql用户,可以本地操作rmysql库
Query OK, 0 rows affected (0.00 sec)

mysql> use rmysql  # 切换到rmysql库
Database changed

mysql> CREATE TABLE t_user(   # 创建一个表 t_user
    -> id INT PRIMARY KEY AUTO_INCREMENT,
    -> user varchar(12) NOT NULL UNIQUE
    -> )ENGINE=INNODB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.07 sec)

mysql> INSERT INTO t_user(user) values('A1'),('AB'),('fens.me');  # 插入3条数据
Query OK, 3 rows affected (0.04 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM t_user;      # 查询数据
+----+---------+
| id | user    |
+----+---------+
|  1 | A1      |
|  2 | AB      |
|  3 | fens.me |
+----+---------+
3 rows in set (0.00 sec)

4. 通过R程序,读MySQL数据库数据

~ R    # 启动R程序
> library(RMySQL)    # 加载RMySQL
Loading required package: DBI
> conn <- dbConnect(MySQL(), dbname = "rmysql", username="rmysql", password="rmysql")  # 建立连接

> users = dbGetQuery(conn, "SELECT * FROM t_user")    # 执行SQL
> users    # 查看数据
  id    user
1  1      A1
2  2      AB
3  3 fens.me

> dbDisconnect(conn)    # 断开连接
[1] TRUE

好了,我们已经实现了在Linux Ubuntu下面R和MySQL的连接。

6.1.3 RMySQL在Win7下安装

Win系统环境:

  • Win7: 64位旗舰版
  • Win字符集: gbk,utf8
  • R: 3.0.1, x86_64-w64-mingw32/x64 (64-bit)
  • MySQL: mysql Ver 14.14 Distrib 5.6.11, for Win64 (x86_64)

1. 查看Win7下的系统环境

~ R --version   # 查看R版本
R version 3.0.1 (2013-05-16) -- "Good Sport"
Copyright (C) 2013 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

~ mysql --version   # MySQL版本
mysql  Ver 14.14 Distrib 5.6.11, for Win64 (x86_64)
mysql> show variables like '%char%';    # MySQL字符集
+--------------------------+------------------------------------+
| Variable_name            | Value                              |
+--------------------------+------------------------------------+
| character_set_client     | gbk                                |
| character_set_connection | gbk                                |
| character_set_database   | utf8                               |
| character_set_filesystem | binary                             |
| character_set_results    | gbk                                |
| character_set_server     | utf8                               |
| character_set_system     | utf8                               |
| character_sets_dir       | D:\toolkit\mysql56\share\charsets\ |
+--------------------------+------------------------------------+
8 rows in set (0.07 sec)

2. 在R环境中安装RMySQL

~ R    # 启动R程序
> install.packages('RMySQl')    # 安装RMySQL
package 'RMySQl' is not available (for R version 3.0.1)

又遇到错误,提示没有对应的RMySQL安装版本。

由于RMySQL没有提供Windows版本的发行包,所以我们必须在Windows中手动编译RMySQL的包后,再进行安装。

> install.packages("RMySQL", type="source")    # 下载RMySQL源代码包
URLhttp://cran.dataguru.cn/src/contrib/RMySQL_0.9-3.tar.gz'
Content type 'application/x-gzip' length 165363 bytes (161 Kb)
URL
downloaded 161 Kb

找到源代码包:RMySQL_0.9-3.tar.gz

~ dir C:\Users\Administrator\AppData\Local\Temp\RtmpsfqQjK\downloaded_packages
2013-09-24  13:16           165,363 RMySQL_0.9-3.tar.gz

通过源代码包安装

~ D:\workspace\R\mysql>R CMD INSTALL C:\Users\Administrator\AppData\Local\Temp\RtmpsfqQjK\downloaded_packages\RMySQL_0.9-3.tar.gz
* installing to library 'C:/Program Files/R/R-3.0.1/library'
* installing *source* package 'RMySQL' ...
** 'RMySQL'MD5
checking for $MYSQL_HOME... not found... searching registry...

# 省略部分输出
readRegistry("SOFTWARE\\MySQL AB", hive = "HLM", maxdepth = 2) :
  Registry key 'SOFTWARE\MySQL AB' not found

ERROR: configuration failed for package 'RMySQL'
* removing 'C:/Program Files/R/R-3.0.1/library/RMySQL'

又出错了,我们还需要设置MYSQL_HOME的环境变量。

~ set MYSQL_HOME=D:\toolkit\mysql56

注: MYSQL_HOME建议设置在系统环境变量中。

再一次安装RMySQL

D:\workspace\R\mysql>R CMD INSTALL C:\Users\Administrator\AppData\Local\Temp\RtmpsfqQjK\downloaded_packages\RMySQL_0.9-3.tar.gz
# 省略部分输出
gcc.exe: error: D:\toolkit\mysql56/bin/libmySQL.dll: No such file or directory
ERROR: compilation failed for package 'RMySQL'
* removing 'C:/Program Files/R/R-3.0.1/library/RMySQL'

错误不断,这次提示为没有找到动态链接库文件 D:\toolkit\mysql56/bin/libmySQL.dll 。

# 复制动态链接库libmySQL.dll
cp D:\toolkit\mysql56\lib\libmysql.dll D:\toolkit\mysql56\bin\
mv D:\toolkit\mysql56\bin\libmysql.dll D:\toolkit\mysql56\bin\libmySQL.dll

再一次安装RMySQL

~ D:\workspace\R\mysql>R CMD INSTALL C:\Users\Administrator\AppData\Local\Temp\RtmpsfqQjK\downloaded_packages\RMySQL_0.9-3.tar.gz
# 省略部分输出
installing to C:/Program Files/R/R-3.0.1/library/RMySQL/libs/x64
** R
** inst
** preparing package for lazy loading
Creating a generic function for 'format' from package 'base' in package 'RMySQL'
Creating a generic function for 'print' from package 'base' in package 'RMySQL'
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded
MYSQL_HOME defined as D:\toolkit\mysql56
* DONE (RMySQL)

终于安装成功了,值得庆祝!

3. 在MySQL中建库建表

操作过程与在Linux中操作过程一致。

~ mysql -uroot -p

mysql> create database rmysql;
Query OK, 1 row affected (0.04 sec)

mysql> grant all on rmysql.* to rmysql@'%' identified by 'rmysql';
Query OK, 0 rows affected (0.00 sec)

mysql> grant all on rmysql.* to rmysql@localhost identified by 'rmysql';
Query OK, 0 rows affected (0.00 sec)

mysql> use rmysql
Database changed
mysql> CREATE TABLE t_user(
    -> id INT PRIMARY KEY AUTO_INCREMENT,
    -> user varchar(12) NOT NULL UNIQUE
    -> )ENGINE=INNODB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (1.01 sec)

mysql>
mysql> INSERT INTO t_user(user) values('A1'),('AB'),('fens.me');
Query OK, 3 rows affected (0.05 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM t_user;
+----+---------+
| id | user    |
+----+---------+
|  1 | A1      |
|  2 | AB      |
|  3 | fens.me |
+----+---------+
3 rows in set (0.03 sec)

4. 通过R程序,读MySQL数据库数据。

如果刚才没有把MYSQL_HOME的变量写到环境变更中,每次在启动R之前都要设置这个变量。

~ set MYSQL_HOME=D:\toolkit\mysql56    # 在当前环境中,设置环境变量
~ R    # 启动R程序
> library(RMySQL)    # 加载RMySQL
DBI
MYSQL_HOME defined as D:\toolkit\mysql56
> conn <- dbConnect(MySQL(), dbname = "rmysql", username="rmysql", password="rmysql")  # 建立连接
> users = dbGetQuery(conn, "SELECT * FROM t_user")    # 执行SQL
> users
  id    user
1  1      A1
2  2      AB
3  3 fens.me

> dbDisconnect(conn)  # 断开连接
[1] TRUE

好了,我们又实现了在Win7下的R和MySQL的连接

6.1.4 RMySQL函数使用

环境都安装好了,接下来我们就具体使用RMySQL包了。本节主要介绍RMySQL辅助操作、数据库操作以及win的字符集设置。

1. RMySQL辅助函数

> conn <- dbConnect(MySQL(), dbname = "rmysql", username="rmysql",  # 建立本地连接
password="rmysql",client.flag=CLIENT_MULTI_STATEMENTS)
> conn <- dbConnect(MySQL(), dbname = "rmysql", username="rmysql", # 建立远程连接
password="rmysql",host="192.168.1.201",port=3306)
> dbDisconnect(conn)    # 关闭连接
> dbListTables(conn)    # 查看数据库中所有的表
[1] "t_user"
> dbListFields(conn, "t_user")    # 查看表的字段
[1] "id"   "user"
> summary(MySQL(), verbose = TRUE)    # 查询MySQL信息
<MySQLDriver:(23864)>
  Driver name:  MySQL
  Max  connections: 16
  Conn. processed: 3
  Default records per fetch: 500
  DBI API version:
> summary(conn, verbose = TRUE)    # MySQL连接实例信息
<MySQLConnection:(23864,2)>
  User: root
  Host: localhost
  Dbname: rmysql
  Connection type: localhost via TCP/IP
  MySQL server version:  5.6.11
  MySQL client version:  5.6.11
  MySQL protocol version:  10
  MySQL server thread id:  35
  No resultSet available

> dbListConnections(MySQL())    # MySQL连接信息
[[1]]
<MySQLConnection:(23864,2)>

2. RMySQL数据库操作

> t_demo<-data.frame(    # 建表并插入数据
  a=seq(1:10),
  b=letters[1:10],
  c=rnorm(10)
)
> dbWriteTable(conn, "t_demo", t_demo)

> dbReadTable(conn, "t_demo")    # 获得整个表数据
    a b           c
1   1 a  0.98868164
2   2 b -0.66935770
3   3 c  0.27703638
4   4 d  1.36137156
5   5 e -0.70291017
6   6 f  1.61235088
7   7 g  0.17616068
8   8 h  0.29700017
9   9 i  0.19032719
10 10 j -0.06222173

> dbWriteTable(conn, "t_demo", t_demo, append=TRUE)    # 插入新数据
> dbReadTable(conn, "t_demo")
   row_names  a b           c
1          1  1 a  0.98868164
2          2  2 b -0.66935770
3          3  3 c  0.27703638
4          4  4 d  1.36137156
5          5  5 e -0.70291017
6          6  6 f  1.61235088
7          7  7 g  0.17616068
8          8  8 h  0.29700017
9          9  9 i  0.19032719
10        10 10 j -0.06222173
11         1  1 a  0.98868164
12         2  2 b -0.66935770
13         3  3 c  0.27703638
14         4  4 d  1.36137156
15         5  5 e -0.70291017
16         6  6 f  1.61235088
17         7  7 g  0.17616068
18         8  8 h  0.29700017
19         9  9 i  0.19032719
20        10 10 j -0.06222173

 dbWriteTable(conn, "t_demo", t_demo, overwrite=TRUE)    # 覆盖原表数据
> d0 <- dbGetQuery(conn, "SELECT * FROM t_demo where c>0")    # 查询数据
> class(d0)
[1] "data.frame"

> d0
  row_names a b         c
1         1 1 a 0.9886816
2         3 3 c 0.2770364
3         4 4 d 1.3613716
4         6 6 f 1.6123509
5         7 7 g 0.1761607
6         8 8 h 0.2970002
7         9 9 i 0.1903272

> rs <- dbSendQuery(conn, "SELECT * FROM t_demo where c>0")   # 执行SQL脚本查询,并分页
> class(rs)
[1] "MySQLResult"
attr(,"package")
[1] "RMySQL"
> mysqlCloseResult(rs)
[1] TRUE

> d1 <- fetch(rs, n = 3)
> d1
  row_names a b         c
1         1 1 a 0.9886816
2         3 3 c 0.2770364
3         4 4 d 1.3613716

> summary(rs, verbose = TRUE)    # 查看集统计信息
  row_names               a              b                   c
 Length:7           Min.   :1.000   Length:7           Min.   :0.1762
 Class :character   1st Qu.:3.500   Class :character   1st Qu.:0.2337
 Mode  :character   Median :6.000   Mode  :character   Median :0.2970
                    Mean   :5.429                      Mean   :0.7004
                    3rd Qu.:7.500                      3rd Qu.:1.1750
                    Max.   :9.000                      Max.   :1.6124

> dbWriteTable(conn, "t_demo", t_demo,row.names=FALSE,overwrite=TRUE)  # 不插入row.names字段
> dbGetQuery(conn, "SELECT * FROM t_demo where c>0")
  a b         c
1 1 a 0.9886816
2 3 c 0.2770364
3 4 d 1.3613716
4 6 f 1.6123509
5 7 g 0.1761607
6 8 h 0.2970002
7 9 i 0.1903272

> if(dbExistsTable(conn,'t_demo')){  # 删除表
+     dbRemoveTable(conn, "t_demo")
+ }
[1] TRUE
执行SQL语句,dbSendQuery
> query<-dbSendQuery(conn, "show tables")
> data <- fetch(query, n = -1)
> data
  Tables_in_rmysql
1           t_demo
2           t_user
> mysqlCloseResult(query)
[1] TRUE

特别提示,尽量避免用dbWriteTable()函数,因为它会删除你原来定义的表结构,创建新的表结构再插入数据!

3. Windows的字符集设置

在Win7中,向MySQL数据库中插入中文时,由于字符编码是gbk,所以会出现中文的乱码。下面通过MySQL命令行客户端插入中文。

mysql> INSERT INTO t_user(user) values('小朋友'),('你好'),('正确了');  # 插入中文数据
Query OK, 3 rows affected (0.07 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> select * from t_user;  # 查询数据,中文显示正确
+----+---------+
| id | user    |
+----+---------+
|  1 | A1      |
|  2 | AB      |
|  3 | fens.me |
|  5 | 你好    |
|  4 | 小朋友  |
|  6 | 正确了  |
+----+---------+
6 rows in set (0.07 sec)

通过RMySQL查询

> dbGetQuery(conn, "SELECT * FROM t_user")  # 查询数据,中文显示为乱码
  id    user
1  1      A1
2  2      AB
3  3 fens.me
4  5      ??
5  4     ???
6  6     ???

在R的API中,设置GKB字符集。

> dbDisconnect(conn)
> conn <- dbConnect(MySQL(), dbname = "rmysql", username="root", password="",client.flag=CLIENT_MULTI_STATEMENTS)
> dbSendQuery(conn,'SET NAMES gbk')   # 设置GKB字符集
> query<-dbSendQuery(conn, "SELECT * FROM t_user")  # 查询数据
> data <- fetch(query, n = -1)
> mysqlCloseResult(query)
[1] TRUE

> data   # 中文显示正确
  id    user
1  1      A1
2  2      AB
3  3 fens.me
4  5    你好
5  4  小朋友
6  6  正确了

我们在Win7下面修正字符编号的问题。

6.1.5 RMySQL案例实践

接下来,我们再做一个远程数据库连接的案例。在Win7的环境使用RMySQL客户端连接远程Linux上的MySQL数据库服务器。

1. 在远程数据库创建表

通过SQL语句,先在MySQL数据库建一张新表t_blog,主键索引,唯一键索引。下面是建表语句。

# 建表
mysql> CREATE TABLE t_blog(
id INT PRIMARY KEY AUTO_INCREMENT,
title varchar(12) NOT NULL UNIQUE,
author varchar(12) NOT NULL,
length int NOT NULL,
create_date timestamp NOT NULL DEFAULT now()
)ENGINE=INNODB DEFAULT CHARSET=UTF8;

mysql> desc t_blog;   # 查看表结构
+-------------+-------------+------+-----+-------------------+----------------+
| Field       | Type        | Null | Key | Default           | Extra          |
+-------------+-------------+------+-----+-------------------+----------------+
| id          | int(11)     | NO   | PRI | NULL              | auto_increment |
| title       | varchar(12) | NO   | UNI | NULL              |                |
| author      | varchar(12) | NO   |     | NULL              |                |
| length      | int(11)     | NO   |     | NULL              |                |
| create_date | timestamp   | NO   |     | CURRENT_TIMESTAMP |                |
+-------------+-------------+------+-----+-------------------+----------------+
5 rows in set (0.00 sec)

mysql> show indexes from t_blog;   # 查看表的索引
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table  | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| t_blog |          0 | PRIMARY  |            1 | id          | A         |           3 |     NULL | NULL   |      | BTREE      |         |               |
| t_blog |          0 | title    |            1 | title       | A         |           3 |     NULL | NULL   |      | BTREE      |         |               |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
2 rows in set (0.00 sec)

# 插入数据
mysql> INSERT INTO t_blog(title,author,length) values('你好,第一篇','Conan',20),('RMySQL数据库编程','Conan',99),('R的极客理想系列文章','Conan',15);

mysql> select * from t_blog;  # 查询表
+----+------------------------------+--------+--------+---------------------+
| id | title                        | author | length | create_date         |
+----+------------------------------+--------+--------+---------------------+
|  1 | 你好,第一篇                 | Conan  |     20 | 2013-08-15 00:13:13 |
|  2 | RMySQL数据库编程             | Conan  |     99 | 2013-08-15 00:13:13 |
|  3 | R的极客理想系列文章          | Conan  |     15 | 2013-08-15 00:13:13 |
+----+------------------------------+--------+--------+---------------------+
3 rows in set (0.00 sec)

2. RMySQL访问远程MySQL

用RMySQL远程访问数据库,并插入数据,包括中文字段,再取出数据。

> library(RMySQL)

# 建立远程连接
> conn <- dbConnect(MySQL(), dbname = "rmysql", username="rmysql", password="rmysql",host="192.168.1.201",port=3306)

> dbSendQuery(conn,'SET NAMES gbk')  # 设置gbk字符编码

# 执行SQL,插入中文数据
> dbSendQuery(conn,"INSERT INTO t_blog(title,author,length) values('R插入的新文章','Conan',50)");


> query<-dbSendQuery(conn, "SELECT * FROM t_blog")  # 查询数据
Warning message:
In mysqlExecStatement(conn, statement, ...) :
  RS-DBI driver warning: (unrecognized MySQL field type 7 in column 4 imported as character)

> data <- fetch(query, n = -1)  # 查询数据
> mysqlCloseResult(query)
[1] TRUE

> print(data)  # 中文显示正常
  id               title author length         create_date
1  1        你好,第一篇  Conan     20 2013-08-15 00:13:13
2  2    RMySQL数据库编程  Conan     99 2013-08-15 00:13:13
3  3 R的极客理想系列文章  Conan     15 2013-08-15 00:13:13
4  4       R插入的新文章  Conan     50 2013-08-15 00:29:45
>
> dbDisconnect(conn)
[1] TRUE

掌握RMySQL的各种使用技巧并且理解原理,就能少犯错误,提高工作效率!

results matching ""

    No results matching ""