融合图片的魔法（泊松图像编辑）

0x3喵酱

2018-11-16 22:52:27

Personal

![](https://cdn.luogu.com.cn/upload/pic/43643.png) 有位dalao在APIO2018的时候讲解了这个算法，只可惜本蒟蒻当初并没有听懂太多，不过下来又研究研究算是研究懂了。本文当中的内容全部是作者自己理解的内容，如果发现错误欢迎指出，我将立马纠正QAQ。本文中只讲了泊松图像编辑中的泊松图像融合，其它的内容因为我还没有理解所以就不写了QAQ。注：本文中的代码全部使用Java实现，因为用Java实现可以更简单（本人是这样认为的QAQ），但是事实上阅读本文可以不需要Java基础（因为Java是一种类C语言，在掌握C/C++的时候在看Java代码基本还是可以理解的）。 **前置知识：** * 矩阵/向量的概念 * 高斯消元 **参考文献/文章：** * [Poisson Image Editing](https://www.cs.virginia.edu/~connelly/class/2014/comp_photo/proj2/poisson.pdf) * [Seamless cloning泊松克隆-Siggraph 2004](https://blog.csdn.net/hjimce/article/details/45716603) * APIO的课件（QAQ我并没有找到下载链接） * [百度百科-雅克比迭代](https://baike.baidu.com/item/%E9%9B%85%E5%85%8B%E6%AF%94%E8%BF%AD%E4%BB%A3%E6%B3%95/8736410) * [迭代法求解线性方程组的收敛问题总结](https://blog.csdn.net/Reborn_Lee/article/details/80958869) ## 目录 * 关于Java的一些简单说明 * 图像梯度和散度 * 泊松图像融合 * 泊松重建 * 雅克比迭代法 * 小结&完整代码&失误产生的图片 ## 关于Java的一些简单说明在Java中，程序是从主类中的主函数开始运行。下面给出一个主类的示范：（如果你不清楚“类”是什么，你可以把这里的“类”当做“结构体”） ```cpp public class Main//主类的定义，名称不一定是Main { public static void main(String[] args)//这是主函数，args是运行参数 { System.out.printf("Hello World!");//这里相当于C/C++的printf("Hello World!"); } } ``` 在Java中，所有函数都必须在“类”里面定义，而不能在“类”的外面。对于类里面的一个全局函数/变量都分为静态和非静态，静态的可以直接调用，非静态的只能在“对象”中调用（如果你不清楚“对象”是什么，你可以把它当做“结构体变量”）。静态的函数/变量都会有`static`关键字修饰。下面是示范代码： ```cpp public static Main { public static void main(String[] args) { fun1();//静态的可以直接调用 Main main = new Main();//新建主类“对象” //在Java中新建一个对象并不需要用指针 //Java没有指针（但是非要说的话其实Java中所有的对象都是指针） //如果你看不懂上面两句话在讲什么可以无视 main.fun2();//在对象中调用fun2 } public static void fun1()//静态函数fun1 { } public void fun2()//非静态函数fun2 { } } ``` 关于Java中注释的部分说明： ```cpp /** * 这里是对函数的说明 * * @param a 这里是对函数的参数a的说明 * @param b 这里是对函数的参数b的说明 */ public void fun(int a,int b) { } ``` 关于Java中数组的说明： ```cpp public static void main(String[] args) { int[] a = new int[10];//10为数组长度 //我们可以用a.length表示a的数组长度，它的值为10 int[][] b = new int[10][10];//b为10*10的二维数组 } ``` 下文中使用的`Color`类的代码： ```cpp public class Color { public int a; public int[] rgb = new int[3]; public Color(int r, int g, int b, int a)//构造函数 { this.a = a;//将类中的非静态全局变量a的值赋值为构造函数的参数a //这里的a表示的是alpha值，即像素的透明度 rgb[0] = r; rgb[1] = g; rgb[2] = b; } } ``` ## 图像梯度和散度由于梯度/散度在数学上的定义比较复杂，在此我就只讲一下如何计算梯度/散度，以及如何把梯度化为散度。我们知道，普通的RGB位图上的每一个点都是用一个三维的向量来表示，向量的 $ x,y,z $ 三个坐标分别对应了红色，蓝色，绿色。对于每一个点的每一种颜色（RGB）来说都有不同的梯度/散度，而且每种颜色的梯度/散度都是相对独立的，所以接下来我们就只对于单独一种颜色讲。在下文中，我们使用 $l(x,y)$ 来表示原位图中 $ (x,y) $ 这个点的颜色。 **梯度的计算** 梯度是一个二维向量，它的计算公式相当简单（我们用 $ G(x,y) $ 来表示 $ (x,y) $ 这个点的梯度）。 $ G(x,y) = \left[ \begin{matrix} l(x,y)-l(x-1,y) \\ l(x,y)-l(x,y-1)\end{matrix} \right]$ 代码： ```cpp /** * 获取梯度向量的横坐标 * * @param a 颜色数组 * @param type 表示颜色类型，0-R 1-G 2-B */ public static int[][] getGradientX(Color[][] a, int type) { int[][] result = new int[a.length][a[0].length]; int i, j, w = a.length, h = a[0].length; for (i = 0; i < w; i++) { for (j = 0; j < h; j++) { if (i == 0) result[i][j] = a[i][j].rgb[type]; else result[i][j] = a[i][j].rgb[type] - a[i - 1][j].rgb[type]; } } return result; } /** * 获取梯度向量的纵坐标 * * @param a 颜色数组 * @param type 表示颜色类型，0-R 1-G 2-B */ public static int[][] getGradientY(Color[][] a, int type)//获取梯度向量的纵坐标 { int[][] result = new int[a.length][a[0].length]; int i, j, w = a.length, h = a[0].length; for (i = 0; i < w; i++) { for (j = 0; j < h; j++) { if (j == 0) result[i][j] = a[i][j].rgb[type]; else result[i][j] = a[i][j].rgb[type] - a[i][j - 1].rgb[type]; } } return result; } ``` **散度的计算** 散度是一个标量，它的计算公式也是非常简单（我们用 $ div(x,y) $ 来表示 $ (x,y) $ 这个点的散度）。 $ div(x,y) = l(x-1,y)+l(x+1,y)+l(x,y-1)+l(x,y+1)-4l(x,y) $ 由于这个算法并不需要直接算散度，所以这里就不贴代码了。 **把梯度化为散度** 相信这个大家应该也是可以做到自己推出来的，下面就给以下公式（这里用 $ G(x,y)_x $ 表示 $ G(x,y) $ 这个向量的横坐标，用 $ G(x,y)_y $ 表示 $ G(x,y) $ 这个向量的纵坐标）： $ div(x,y) = G(x+1,y)_x + G(x,y+1)_y - G(x,y)_x-G(x,y)_y $ ```cpp /** * 把梯度转化为散度的函数 * * @param gx 所有梯度的横坐标构成的数组 * @param gy 所有梯度的纵坐标构成的数组 */ public static int[][] getDivergence(int gx[][], int gy[][]) { int i, j, w = gx.length, h = gx[0].length; int[][] result = new int[w][h]; for (i = 0; i < w; i++) { for (j = 0; j < h; j++) { result[i][j] = 0; if (i == w - 1) result[i][j] += -gx[i][j]; else result[i][j] += gx[i + 1][j] - gx[i][j]; if (j == h - 1) result[i][j] += -gy[i][j]; else result[i][j] += gy[i][j + 1] - gy[i][j]; } } return result; } ``` ## 泊松图像融合这是背景图片 $ S $ 的梯度场： ![](https://cdn.luogu.com.cn/upload/pic/43651.png) 我们现在要将另一个图片 $ g $ 的被黄色线条圈起来的区域融合到 $ S $ 内。 ![](https://cdn.luogu.com.cn/upload/pic/43653.png) 其实这个融合也很简单，直接将 $ g $ 中圈起来的部分替换掉原图像 $ S $ 中中相应的部分就可以了。这个过程就是泊松图像融合。 ![](https://cdn.luogu.com.cn/upload/pic/43663.png) ## 泊松重建我们已经知道如何计算一个图片的梯度和散度，以及如何把两个图片的梯度融合，可是我们还不知道如何把梯度还原成图片。把梯度还原成图片其实也并不是一件难事，第一步是把梯度转化成散度，至于怎么转化上文已经说过了。那么现在的问题就是我已知一个散度图，以及部分像素（离融合部分较远的地方像素可以看做不变）如何把它还原成原图。 $ \left[ \begin{matrix} div(1,1)\ div(1,2) \ div(1,3) \ div(1,4) \\ div(2,1)\ div(2,2) \ div(2,3) \ div(2,4) \\ div(3,1)\ div(3,2) \ div(3,3) \ div(3,4) \\ div(4,1)\ div(4,2) \ div(4,3) \ div(4,4) \\ \end{matrix} \right] $ 这是一个由散度构成的矩阵，我们知道根据散度的定义有： $ div(x,y) = l(x-1,y)+l(x+1,y)+l(x,y-1)+l(x,y+1)-4l(x,y) $ 那么也就是说，我可以把 $ l(x,y) $ 看做一个未知数，然后我们根据已有的条件必然可以列出来一些方程组。（假设我们现在使用的例子是已知最外圈的像素） $ l(1,2)+l(2,1)+l(3,2)+l(2,3)-4l(2,2)=div(2,2) $ $ l(1,3)+l(2,2)+l(3,3)+l(2,4)-4l(2,3)=div(2,3) $ $ l(2,2)+l(3,1)+l(4,2)+l(3,3)-4l(3,2)=div(3,2) $ $ l(2,3)+l(3,2)+l(4,3)+l(3,4)-4l(3,3)=div(3,3) $ 那么根据这个我们可以得到什么呢？很显然，我们可以令一个矩阵的第 $ (4x+y) $ 列表示 $ l(x,y) $ 这个未知数然后列出来增广矩阵（由于增广矩阵很大，这里就不画了）。在增广矩阵中，对于已知的像素都是在它本身的那一列为 $ 1 $ ，最后一列为代表它的像素值，对于未知的像素，在它本身那一列为 $ -4 $ ，表示它上下左右那几列为 $ 1 $，最后一列为它的散度。 ~~那么我们可以对它进行高斯消元。~~ 真的可以高斯消元吗？如果我们要处理一个 $ 100 \times 100 $ 这么小的一张图片，高斯消元存在 $ 10^4 $ 个元，根据高斯消元 $ O(n^3) $ 的复杂度，那么就需要计算超过 $ 10^{12} $ 次（而且还没算常数）！而对于 $ 10^{12} $ 次运算，我们假设你的电脑每秒运行 $ 5\times 10^8 $ 次（已经很快了吧），也就是需要运行 $ 2000s $ 。所以，对于 $ 100 \times 100$ 这么小的一张图片，用高斯消元都需要算半个小时！最终的结论肯定是不能使用高斯消元，那么接下来我们将引入一种全新的方法来求解线性方程组。 ## 雅克比迭代法雅克比迭代法是一种通过迭代的方式来逼近线性方程组的解的方法。关于它的证明如果各位感兴趣的话可以进行百度，这里就不证明了。我们用向量 $ x^{(k)} $ 表示第 $ k $ 次迭代后得到的方程的近似解，我们令 $ x^{(0)}= \vec 0 $ 。对于原本的系数矩阵，我们把它分为三部分 $ L,U,D $。其中 $ L $ 表示下三角矩阵，$ D $ 表示对角矩阵，$ U $ 表示上三角矩阵。 ![](https://cdn.luogu.com.cn/upload/pic/43666.png) (上图来自百度百科) 我们用向量 $ \vec b $ 来表示增广矩阵的最后一列。雅克比迭代的公式是：$ x^{(k)} = D^{-1} (\vec b-(L+U)x^{(k-1)})$ 。其中 $ D^{-1} $ 表示 $ D $ 矩阵的逆，对于一个仅仅对角线上有元素的矩阵 $ D $ 来说 $ D^{-1}_{ii}=\frac 1 {D_{ii}} $ 。在这里给出矩阵以及雅克比迭代Java版的代码： ```cpp public class Matrix { public final int n, m; private double a[][]; public Matrix(double a[][]) { this.a = a.clone(); n = a.length; m = a[0].length; } public Matrix(int n, int m) { this.n = n; this.m = m; a = new double[n][m]; } /** * 获取矩阵的第i行第j个元素 */ public double numberAt(int i, int j) { return a[i - 1][j - 1]; } /** * 矩阵乘法 */ public static Matrix mul(Matrix A, Matrix B) { if (A.m != B.n) return null; Matrix C = new Matrix(A.n, B.m); int i, j, k; for (i = 0; i < C.n; i++) { for (j = 0; j < C.m; j++) { for (k = 0; k < A.m; k++) { C.a[i][j] += A.a[i][k] * B.a[k][j]; } } } return C; } /** * 矩阵加法 */ public static Matrix plus(Matrix A, Matrix B) { if (A.n != B.n || A.m != B.m) return null; Matrix C = new Matrix(A.n, A.m); int i, j; for (i = 0; i < A.n; i++) { for (j = 0; j < A.m; j++) { C.a[i][j] = A.a[i][j] + B.a[i][j]; } } return C; } /** * 矩阵减法 */ public static Matrix sub(Matrix A, Matrix B) { if (A.n != B.n || A.m != B.m) return null; Matrix C = new Matrix(A.n, A.m); int i, j; for (i = 0; i < A.n; i++) { for (j = 0; j < A.m; j++) { C.a[i][j] = A.a[i][j] - B.a[i][j]; } } return C; } /** * 输出矩阵 */ public void print() { int i, j; for (i = 0; i < n; i++) { for (j = 0; j < m; j++) { System.out.printf("%.2f\t", a[i][j]); } System.out.println(); } } /** * 本函数为雅克比迭代法求解线性方程组 * * @param k 迭代次数 * @param A 系数矩阵（需要保证系数矩阵对角线没有0） * @param b 增广矩阵最后一列 */ public static Matrix jacobi(int k, Matrix A, Matrix b) { if (b.m != 1) return null; if (b.n != A.n) return null; if (A.n != A.m) return null; Matrix L, D, U;//L-下三角矩阵 D-对角线矩阵 U-上三角矩阵 Matrix x = new Matrix(A.n, 1);//迭代向量 L = new Matrix(A.n, A.n); D = new Matrix(A.n, A.n); U = new Matrix(A.n, A.n); int i, j; for (i = 0; i < A.n; i++) { for (j = i - 1; j >= 0; j--) L.a[i][j] = A.a[i][j]; D.a[i][i] = A.a[i][i]; for (j = i + 1; j < A.n; j++) U.a[i][j] = A.a[i][j]; } Matrix I = new Matrix(D.n, D.m);//D的逆矩阵 for (i = 0; i < D.n; i++) I.a[i][i] = 1.0 / D.a[i][i]; Matrix tmp = plus(L, U);//用tmp记录L和U的和 for (i = 1; i <= k; i++) { x = mul(I, sub(b, mul(tmp, x))); System.out.println("第" + i + "次迭代成果："); x.print(); } return x; } } ``` 其实雅克比迭代法经常会出现不收敛的情况，对于它什么时候收敛可以参考文章：[迭代法求解线性方程组的收敛问题总结](https://blog.csdn.net/Reborn_Lee/article/details/80958869) 而在本问题中使用雅克比迭代是收敛的。你可能会说，使用上面的雅克比迭代法仍然没办法对较大的图片进行处理啊？但是请你仔细想想矩阵乘法的过程，在较大的图片中，我们进行雅克比迭代的矩阵一定是一个稀疏矩阵，所以没必要把它的每一个元素都存下来，只需要记录它的非零元素即可。至于如何实现可以参考下一部分的代码。 ## 小结&完整代码&失误产生的图片在这里贴出本人写的完整的代码： `top/hookan/imagemagic/Color.java`： ```cpp package top.hookan.imagemagic; public class Color { public int a; public int[] rgb = new int[3]; public Color(int r, int g, int b, int a) { this.a = a; rgb[0] = r; rgb[1] = g; rgb[2] = b; } } ``` `top/hookan/imagemagic/Main.java`： ```cpp package top.hookan.imagemagic; import javax.imageio.ImageIO; import java.awt.image.BufferedImage; import java.io.File; import java.io.IOException; public class Main { public static void main(String[] args) { File file1 = new File("E:/test-images/testn.jpg"); File file2 = new File("E:/test-images/qqq.png"); File maskf = new File("E:/test-images/test3.png"); File file3 = new File("E:/test-images/qwq.jpg"); //把file2按照maskf里的不透明像素贴到file1上输出到file3 try { Color[][] c1 = readImage(file1); Color[][] c2 = readImage(file2); Color[][] mask = readImage(maskf); int w = c1.length, h = c1[0].length; Color[][] c3 = new Color[w][h]; int i, x, y; for (x = 0; x < w; x++) for (y = 0; y < h; y++) c3[x][y] = new Color(0, 0, 0, 0); for (i = 0; i < 3; i++) { System.out.println("正在计算颜色" + i); //计算梯度图 int[][] gx = getGradientX(c1, i); int[][] gy = getGradientY(c1, i); int[][] gx2 = getGradientX(c2, i); int[][] gy2 = getGradientY(c2, i); //融合梯度图 for (x = 1; x < w - 1; x++) { for (y = 1; y < h - 1; y++) { if (mask[x][y].a != 0) { gx[x][y] = gx2[x][y]; gy[x][y] = gy2[x][y]; } } } int[][] div = getDivergence(gx, gy);//散度图 int[][] LU = new int[w * h][4];//储存矩阵哪些位置为1 double[] I = new double[w * h];//输出对角线矩阵的逆是1还是1/4 double[] b = new double[w * h];//增广矩阵最后一列 for (x = 0; x < w; x++) { for (y = 0; y < h; y++) { if (x == 0 || y == 0 || x == w - 1 || y == h - 1 || mask[x][y].a == 0 && mask[x + 1][y].a == 0 && mask[x - 1][y].a == 0 && mask[x][y + 1].a == 0 && mask[x][y - 1].a == 0) { LU[x * h + y][0] = LU[x * h + y][1] = LU[x * h + y][2] = LU[x * h + y][3] = -1; I[x * h + y] = 1; b[x * h + y] = c1[x][y].rgb[i]; continue; } LU[x * h + y][0] = (x - 1) * h + y; LU[x * h + y][1] = (x + 1) * h + y; LU[x * h + y][2] = x * h + (y - 1); LU[x * h + y][3] = x * h + (y + 1); I[x * h + y] = -0.25; b[x * h + y] = div[x][y]; } } //雅克比迭代，迭代10000次 double[] result = jacobi(10000, LU, I, b); //处理迭代结果放到c3 for (x = 0; x < w; x++) { for (y = 0; y < h; y++) { c3[x][y].rgb[i] = (int) (result[x * h + y] + 0.5); if (c3[x][y].rgb[i] < 0) c3[x][y].rgb[i] = 0; if (c3[x][y].rgb[i] > 255) c3[x][y].rgb[i] = 255; } } } System.out.println("开始输出图片"); //将c3输出成图片 BufferedImage bi = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB); for (x = 0; x < w; x++) { for (y = 0; y < h; y++) { int pixel = (c3[x][y].rgb[0] << 16) + (c3[x][y].rgb[1] << 8) + (c3[x][y].rgb[2]); bi.setRGB(x, y, pixel); } } ImageIO.write(bi, "jpg", file3); } catch (Exception e) { e.printStackTrace(); } } /** * 读取图片 */ public static Color[][] readImage(File f) throws IOException { BufferedImage bi = ImageIO.read(f); int width = bi.getWidth(); int height = bi.getHeight(); Color[][] result = new Color[width][height]; int i, j; for (i = 0; i < width; i++) { for (j = 0; j < height; j++) { int pixel = bi.getRGB(i, j); result[i][j] = new Color((pixel & 0xFF0000) >> 16, (pixel & 0xFF00) >> 8, (pixel & 0xFF), ((pixel & 0xFF000000) >> 24) & 0xFF); } } return result; } public static int[][] getGradientX(Color[][] a, int type) { int[][] result = new int[a.length][a[0].length]; int i, j, w = a.length, h = a[0].length; for (i = 0; i < w; i++) { for (j = 0; j < h; j++) { if (i == 0) result[i][j] = a[i][j].rgb[type]; else result[i][j] = a[i][j].rgb[type] - a[i - 1][j].rgb[type]; } } return result; } public static int[][] getGradientY(Color[][] a, int type) { int[][] result = new int[a.length][a[0].length]; int i, j, w = a.length, h = a[0].length; for (i = 0; i < w; i++) { for (j = 0; j < h; j++) { if (j == 0) result[i][j] = a[i][j].rgb[type]; else result[i][j] = a[i][j].rgb[type] - a[i][j - 1].rgb[type]; } } return result; } public static int[][] getDivergence(int gx[][], int gy[][]) { int i, j, w = gx.length, h = gx[0].length; int[][] result = new int[w][h]; for (i = 0; i < w; i++) { for (j = 0; j < h; j++) { result[i][j] = 0; if (i == w - 1) result[i][j] += -gx[i][j]; else result[i][j] += gx[i + 1][j] - gx[i][j]; if (j == h - 1) result[i][j] += -gy[i][j]; else result[i][j] += gy[i][j + 1] - gy[i][j]; } } return result; } public static double[] jacobi(int n, int[][] LU, double[] I, double[] b) { int len = b.length; double[] x = new double[len]; double[] tmp = new double[len]; int i, j, k; for (i = 0; i < n; i++) { if (i % 1000 == 0) System.out.println("已经迭代" + 100.0 * i / n + "%"); for (j = 0; j < len; j++) { tmp[j] = 0.0; if (LU[j][0] == -1) continue; for (k = 0; k < 4; k++) { if (LU[j][k] >= 0 && LU[j][k] < len) tmp[j] += x[LU[j][k]]; } } for (j = 0; j < len; j++) tmp[j] = b[j] - tmp[j]; for (j = 0; j < len; j++) { x[j] = I[j] * tmp[j]; } } return x; } } ``` [本人代码示例里面使用的图片](https://blog.hookan.top/test/images.zip)。到这里大概就讲完了，本文中唯一对OI有用的内容可能就是雅克比迭代法了（虽然我还没见过要求求解非常大的线性方程组近似解的题（可能是我做的题太少了）（其实还有比雅克比迭代法更快的迭代法但是我没学会QAQ））。目前对于这个算法本人还有一些小小的疑问就是为什么不直接使用梯度值进行消元而要先把梯度化为散度进行消元，本人目前还没有找到解释这块的文章，如果大家发现解释告诉我也是可以的QAQ（本人猜测可能是让结果更准确？）。然后就是关于本人代码中使用的雅克比迭代其实是没必要把大量已知像素放到矩阵里的（废话），但是本人懒得写了QAQ。然后再贴一张失败时产生的图片QAQ： ![](https://cdn.luogu.com.cn/upload/pic/43675.png) 一不小心敲错系数矩阵产生的图片（之前我还以为是雅克比没收敛233）。吐槽：洛谷博客居然不支持Java高亮，所以我只能用cpp高亮了QAQ。 End.