Wednesday, December 26, 2018

Tech Notes:- log file at /var/log/ansible.log is not writeable and we cannot create it, aborting

This is one of the errors that we encounter after installing ansible and trying to run our first playbook as a non-root user. To fix this all that we need to do is to create a group called ansible , add the user who is executing the playbook to the ansible group, then touch the file /var/log/asnible.log with root as the owner and ansible as the group.Permission should be 775 which is read write and execute for  owner and group and read and execute for others.

# sudo touch /var/log/ansible.log

#sudo groupadd ansible

# sudo chown root:ansible /var/log/ansible.log

# sudo chmod 775 /var/log/ansible.log

# sudo usermod sudeep -aG ansible

You need to signout and sign in before executing the playbook.


Sunday, December 16, 2018

Tech Notes:- Mysql Replication Error 'Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file''

I kept my slave mysql db down for a week and then when I switched on the slave db server and checked the replication status, I started seeing this error.



Last_IO_Errno: 1236

Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file'



The most probable reason for this error is that the binary log required for the replication by slave servers no longer exists on the master database server. This could have happened because someone might have manually deleted the binary files on the MySQL master either by removing the files from the file system or using the purge binary logs command.

Another reason could be that a variable has been set in the conf file to expire the logs. In my case, it was this one and the variable was set to 2 days

expire_logs_days=2

Let's have a detailed look at SHOW SLAVE STATUS and SHOW BINARY LOGS on master

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State:
                  Master_Host: 172.25.1.20
                  Master_User: slave
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000023
          Read_Master_Log_Pos: 94955731
               Relay_Log_File: mysqld-relay-bin.000007
                Relay_Log_Pos: 4
        Relay_Master_Log_File: mysql-bin.000023
             Slave_IO_Running: No
            Slave_SQL_Running: Yes


mysql> SHOW BINARY LOGS;
+------------------+-----------+
| Log_name         | File_size |
+------------------+-----------+
| mysql-bin.000029 |      1315 |
+------------------+-----------+
1 row in set (0.21 sec)

So reading it all to gether, slave is looking for the master log file mysql-bin.000023 at position 94955731. But the master is at  mysql-bin.000029  and I dont have the file mysql-bin.000023. So this basically means that the mysql slave stopped replcation while at mysql-bin.000023.

How do we solve this?

The simplest way as well as the advisable approach is to dump the mysql from the master, restore it to slave and then restart the replication. But that may take time if the database is huge and there is no way to take the dump without impacting production performance.

If its a test system and you are ok with losing some data in replication, the best bet would be to set the position of the slave to the start of the new available master file.  

STOP SLAVE;
CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000029', MASTER_LOG_POS=4;
START SLAVE;

You may need to skip some positions sometimes. If you have patience and time, the best option is to examine the events in the log and point to the position accordingly. You can do it using the command below

sudo mysqlbinlog /var/lib/mysql/mysql-bin.000029 | more